Abstract

The best known inner bound on the two-receiver general broadcast channel without a common 
message is due to Marton [3]. This result was subsequently generalized in [2, p. 391, Problem 10(c)] and
[4] to broadcast channels with a common message. However, the latter region is not computable (except
in certain special cases) as no bounds on the cardinality of its auxiliary random variables exist. Nor is 
it even clear that the inner bound is a closed set. The main obstacle in proving cardinality bounds is the 
fact that the Caratheodory theorem, the main known tool for proving cardinality bounds, does not yield 
a finite cardinality result. Our new tool is based on an identity that relates the second derivative of the 
Shannon entropy of a discrete random variable (under a certain perturbation) to the corresponding Fisher 
information. In order to go beyond the traditional Caratheodory type arguments, we identify certain 
properties that the auxiliary random variables corresponding to the extreme points of the inner bound 
satisfy. These properties are then used to establish cardinality bounds on the auxiliary random variables 
of the inner bound, thereby proving the computability of the region, and its closedness. 

Although existence of cardinality bounds renders Marton's inner bound computable, it is still hard 
to evaluate the region. It is however shown that the computation can be significantly simplified if we 
further assume that Marton's inner bound and the recent outer bound of Nair and El Gamal match at the 
given particular channel. In order to demonstrate this, we consider a large class of binary input broadcast
channels and compute the maximum of the sum rate of private messages, assuming that the inner and the
outer bounds match at the given broadcast channel. We also show that the inner and the outer bounds do
not match for some broadcast channels, thus establishing a conjecture of [15].

I. Introduction 

In this paper, we consider two-receiver general broadcast channels. A two-receiver broadcast channel
is characterized by the conditional distribution $q(y, z|x)$, where $X$ is the input to the channel and $Y$ and
$Z$ are the outputs of the channel at the two receivers. Let $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ denote the alphabet sets of $X$,
$Y$ and $Z$ respectively. The transmitter wants to send a common message, $M_0$, to both receivers and
two private messages $M_1$ and $M_2$ to $Y$ and $Z$ respectively. Assume that $M_0$, $M_1$ and $M_2$ are mutually
independent, and $M_i$ (for $i = 0, 1, 2$) is a uniform random variable over the set $\mathcal{M}_i$. The transmitter maps
the messages into a codeword of length $n$ using an encoding function $\zeta : \mathcal{M}_0 \times \mathcal{M}_1 \times \mathcal{M}_2 \to \mathcal{X}^n$, and
sends it over the broadcast channel $q(y, z|x)$ in $n$ time steps. The receivers use the decoding functions
$\vartheta_y : \mathcal{Y}^n \to \mathcal{M}_0 \times \mathcal{M}_1$ and $\vartheta_z : \mathcal{Z}^n \to \mathcal{M}_0 \times \mathcal{M}_2$ to map their received signals to $(\widehat{M}_0^{(1)}, \widehat{M}_1)$
and $(\widehat{M}_0^{(2)}, \widehat{M}_2)$ respectively. The average probability of error is then taken to be the probability that
$(\widehat{M}_0^{(1)}, \widehat{M}_1, \widehat{M}_0^{(2)}, \widehat{M}_2)$ is not equal to $(M_0, M_1, M_0, M_2)$.

The capacity region of the broadcast channel is defined as the set of all triples $(R_0, R_1, R_2)$ such
that for any $\epsilon > 0$, there is some integer $n$, uniform random variables $M_0$, $M_1$, $M_2$ with alphabet sets
$|\mathcal{M}_i| \geq 2^{n(R_i - \epsilon)}$ (for $i = 0, 1, 2$), encoding function $\zeta$, and decoding functions $\vartheta_y$ and $\vartheta_z$ such that the
average probability of error is less than or equal to $\epsilon$.

The capacity region of the broadcast channel is not known except in certain special cases. The best
achievable region of triples $(0, R_1, R_2)$ for the broadcast channel is due to Marton [3, Theorem 2]. Marton's
work was subsequently generalized in [2, p. 391, Problem 10(c)], and by Gelfand and Pinsker [4], who
established the achievability of the region formed by taking the union over random variables $U, V, W, X, Y, Z$,
having the joint distribution $p(u, v, w, x, y, z) = p(u, v, w, x)q(y, z|x)$, of

$$R_0, R_1, R_2 \geq 0;$$
$$R_0 \leq \min(I(W;Y), I(W;Z)); \qquad (1)$$
$$R_0 + R_1 \leq I(UW;Y); \qquad (2)$$
$$R_0 + R_2 \leq I(VW;Z); \qquad (3)$$
$$R_0 + R_1 + R_2 \leq I(U;Y|W) + I(V;Z|W) - I(U;V|W) + \min(I(W;Y), I(W;Z)). \qquad (4)$$
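As an illustration of how the right-hand sides of (1)-(4) are evaluated for one fixed choice of auxiliaries, the following is a minimal sketch. It assumes finite alphabets, a distribution `p_uvwx` given as a dictionary keyed by `(u, v, w, x)`, and a channel `q` given as a dictionary mapping `x` to `{(y, z): probability}`. The names `marginal`, `entropy`, `mi` and `marton_bounds` are hypothetical helpers introduced only for this sketch; they are not part of the paper.

```python
import math
from collections import defaultdict

U, V, W, X, Y, Z = range(6)  # positions of each variable in a joint-distribution key

def marginal(joint, idx):
    """Marginalize a dict {(u, v, w, x, y, z): prob} onto the coordinates in idx."""
    out = defaultdict(float)
    for key, prob in joint.items():
        out[tuple(key[i] for i in idx)] += prob
    return out

def entropy(joint, idx):
    """Joint Shannon entropy (in bits) of the coordinates listed in idx."""
    return -sum(p * math.log2(p) for p in marginal(joint, idx).values() if p > 0)

def mi(joint, a, b, c=()):
    """Conditional mutual information I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C)."""
    a, b, c = tuple(a), tuple(b), tuple(c)
    return (entropy(joint, a + c) + entropy(joint, b + c)
            - entropy(joint, a + b + c) - entropy(joint, c))

def marton_bounds(p_uvwx, q):
    """Right-hand sides of (1)-(4) for a fixed p(u,v,w,x) and channel q[x][(y, z)]."""
    joint = {}
    for (u, v, w, x), p in p_uvwx.items():
        for (y, z), c in q[x].items():
            joint[(u, v, w, x, y, z)] = p * c
    common = min(mi(joint, [W], [Y]), mi(joint, [W], [Z]))
    return {
        "R0": common,
        "R0+R1": mi(joint, [U, W], [Y]),
        "R0+R2": mi(joint, [V, W], [Z]),
        "R0+R1+R2": (mi(joint, [U], [Y], [W]) + mi(joint, [V], [Z], [W])
                     - mi(joint, [U], [V], [W]) + common),
    }
```

The region itself is the union of the resulting polytopes over all admissible $p(u, v, w, x)$; the sketch only evaluates a single point of that union, which is exactly why cardinality bounds on $U$, $V$ and $W$ are needed before any exhaustive search becomes possible.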

In Marton's original work, the auxiliary random variables $U$, $V$ and $W$ are finite random variables. We
however allow the auxiliary random variables $U$, $V$ and $W$ to be discrete or continuous random variables
to get an apparently larger region. The main result of this paper however implies that this relaxation
will not make the region grow. We refer to this region as Marton's inner bound for the general
broadcast channel. Recently Liang and Kramer reported an apparently larger inner bound to the broadcast
channel [9], which however turns out to be equivalent to Marton's inner bound [10]. Marton's inner
bound therefore remains the best known inner bound on the general broadcast channel.
Liang, Kramer and Poor showed that in order to evaluate Marton's inner bound, it suffices to search
over $p(u, v, w, x)$ for which either $I(W;Y) = I(W;Z)$, or $I(W;Y) > I(W;Z)$ and $V$ = constant, or
$I(W;Y) < I(W;Z)$ and $U$ = constant holds [10]. This restriction however does not lead to a computable
characterization of the region.

Unfortunately Marton's inner bound is not computable (except in certain special cases) as no bounds
on the cardinality of its auxiliary random variables exist. A prior work by Hajek and Pursley derives
cardinality bounds for an earlier inner bound of Cover and van der Meulen for the special case where $X$ is
binary and $R_0 = 0$ [5]; Hajek and Pursley showed that $X$ can be taken as a deterministic function of the
auxiliary random variables involved, and conjectured certain cardinality bounds on the auxiliary random
variables when $|\mathcal{X}|$ is arbitrary but $R_0$ is equal to zero. For the case of non-zero $R_0$, Hajek and Pursley
commented that finding cardinality bounds appears to be considerably more difficult. The inner bound
of Cover and van der Meulen was however later improved by Marton. A Caratheodory-type argument
results in a cardinality bound of $|\mathcal{V}||\mathcal{X}| + 1$ on $|\mathcal{U}|$, and a cardinality bound of $|\mathcal{U}||\mathcal{X}| + 1$ on $|\mathcal{V}|$ for
Marton's inner bound. This does not lead to fixed cardinality bounds on the auxiliary random variables
$U$ and $V$. The main result of this paper is to prove that the subset of Marton's inner bound defined by
imposing the extra constraints $|\mathcal{U}| \leq |\mathcal{X}|$, $|\mathcal{V}| \leq |\mathcal{X}|$, $|\mathcal{W}| \leq |\mathcal{X}| + 4$ and $H(X|UVW) = 0$ is identical to
Marton's inner bound.

At the heart of our technique lies the following observation: consider an arbitrary set of finite random
variables $X_1, X_2, \ldots, X_n$ jointly distributed according to $p_0(x_1, x_2, \ldots, x_n)$. One can represent a
perturbation of this joint distribution by a vector consisting of the first derivative of the individual probabilities
$p_0(x_1, x_2, \ldots, x_n)$ for all values of $x_1, x_2, \ldots, x_n$. We however suggest the following perturbation that can
be represented by a real valued random variable, $L$, jointly distributed with $X_1, X_2, \ldots, X_n$ and satisfying
$\mathbb{E}[L] = 0$, $|\mathbb{E}[L|X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n]| < \infty$ for all values of $x_1, x_2, \ldots, x_n$:

$$p_\epsilon(\widehat{X}_1 = x_1, \ldots, \widehat{X}_n = x_n) = p_0(X_1 = x_1, \ldots, X_n = x_n) \cdot \big(1 + \epsilon \cdot \mathbb{E}[L|X_1 = x_1, \ldots, X_n = x_n]\big),$$

where $\epsilon$ is a real number in some interval $[-\bar{\epsilon}_1, \bar{\epsilon}_2]$. The random variable $L$ is a canonical way of representing
the direction of perturbation since, given any subset of indices $I \subset \{1, 2, 3, \ldots, n\}$, one can verify the
following equation for the marginal distribution of the random variables $X_i$ for $i \in I$:

$$p_\epsilon(\widehat{X}_{i \in I} = x_{i \in I}) = p_0(X_{i \in I} = x_{i \in I}) \cdot \big(1 + \epsilon \cdot \mathbb{E}[L|X_{i \in I} = x_{i \in I}]\big).$$

Furthermore, for any set of indices $I \subset \{1, 2, 3, \ldots, n\}$, the second derivative of the joint entropy of the
random variables $\widehat{X}_i$ for $i \in I$ as a function of $\epsilon$ is related to the problem of MMSE estimation of $L$
from $X_{i \in I}$:

$$\frac{\partial^2}{\partial \epsilon^2} H(\widehat{X}_{i \in I}) \Big|_{\epsilon = 0} = -\log e \cdot \mathbb{E}\big[\mathbb{E}[L|X_{i \in I}]^2\big].$$

Lemma 3 describes a generic version of the above identity that relates the second derivative of the
Shannon entropy of a discrete random variable to the corresponding Fisher information. This identity is
to the best of our knowledge new. It is repeatedly invoked in our proofs to compute the second derivative
of various expressions.
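The two properties of the perturbation described above (the form of the perturbed marginal, and the second derivative of the marginal entropy at $\epsilon = 0$) can be checked numerically. The following is a minimal sketch for two variables, assuming that $L$ is a deterministic function `l[x1][x2]` of $(X_1, X_2)$ with $\mathbb{E}[L] = 0$. The 2x2 distribution, the direction `l`, and all function names are hypothetical choices made only for this illustration; the second derivative is approximated by a central finite difference rather than computed exactly.

```python
import math

def perturb_joint(p0, l, eps):
    """p_eps(x1, x2) = p0(x1, x2) * (1 + eps * l[x1][x2]), with L = l(X1, X2)."""
    return [[p * (1 + eps * lv) for p, lv in zip(prow, lrow)]
            for prow, lrow in zip(p0, l)]

def marginal_x1(p):
    """Marginal distribution of the first coordinate."""
    return [sum(row) for row in p]

def cond_mean_given_x1(p0, l):
    """E[L | X1 = x1] under the unperturbed distribution p0."""
    return [sum(p * lv for p, lv in zip(prow, lrow)) / sum(prow)
            for prow, lrow in zip(p0, l)]

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def second_derivative_h_x1(p0, l, h=1e-3):
    """Central finite difference of eps -> H(X1_hat) at eps = 0."""
    f = lambda eps: entropy(marginal_x1(perturb_joint(p0, l, eps)))
    return (f(h) - 2 * f(0.0) + f(-h)) / h ** 2

def predicted_second_derivative(p0, l):
    """-log2(e) * E[ E[L | X1]^2 ], the value stated by the identity."""
    m = cond_mean_given_x1(p0, l)
    px = marginal_x1(p0)
    return -math.log2(math.e) * sum(p * mx ** 2 for p, mx in zip(px, m))

# Hypothetical 2x2 example; l is chosen so that E[L] = 0 under p0.
p0 = [[0.1, 0.3], [0.4, 0.2]]
l = [[2.0, -1.0], [0.5, -0.5]]
```

Comparing `marginal_x1(perturb_joint(p0, l, eps))` with `px * (1 + eps * E[L | X1])` checks the marginal equation, and comparing `second_derivative_h_x1(p0, l)` with `predicted_second_derivative(p0, l)` checks the entropy identity up to finite-difference error.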

It is known that Marton's inner bound coincides with the best known outer bound for the degraded, less
noisy, more capable, and semi-deterministic broadcast channels. Nair and Zizhou showed that Marton's
inner bound and the recent outer bound of Nair and El Gamal are different for a BSSC channel with
parameter $\frac{1}{2}$ if a certain conjecture holds.¹ In this paper, we provide examples of broadcast channels for
which the two bounds do not match.

The outline of this paper is as follows. In Section II, we introduce the basic notations and definitions
we use. Section IV contains the main results of the paper, followed by Section V which gives formal
proofs for the results. Appendices complete the proofs of the theorems from Section V.

II. Definitions and Notations 

Let $\mathbb{R}$ denote the set of real numbers. All the logarithms throughout this paper are in base two,
unless stated otherwise. Let $C(q(y, z|x))$ denote the capacity region of the broadcast channel $q(y, z|x)$.
We use $X_{1:k}$ to denote $(X_1, X_2, \ldots, X_k)$; similarly we use $Y_{1:k}$ and $Z_{1:k}$ to denote $(Y_1, Y_2, \ldots, Y_k)$ and
$(Z_1, Z_2, \ldots, Z_k)$ respectively.

Definition 1: For two vectors $\vec{v}_1$ and $\vec{v}_2$ in $\mathbb{R}^d$, we say $\vec{v}_1 \geq \vec{v}_2$ if and only if each coordinate of $\vec{v}_1$
is greater than or equal to the corresponding coordinate of $\vec{v}_2$. For a set $A \subset \mathbb{R}^d$, the down-set $\Delta(A)$ is
defined as: $\Delta(A) = \{\vec{v} \in \mathbb{R}^d : \vec{v} \leq \vec{w} \text{ for some } \vec{w} \in A\}$.

Definition 2: Let $C_M(q(y, z|x))$ denote Marton's inner bound on the channel $q(y, z|x)$. $C_M(q(y, z|x))$
is defined as the union of non-negative triples $(R_0, R_1, R_2)$ satisfying equations (1), (2), (3) and (4) over
random variables $U, V, W, X, Y, Z$, having the joint distribution $p(u, v, w, x, y, z) = p(u, v, w, x)q(y, z|x)$.
Please note that the auxiliary random variables $U$, $V$ and $W$ may be discrete or continuous random
variables.

¹The conjecture is as follows [15, Conjecture 1]: Given any five random variables $U, V, X, Y, Z$ satisfying $I(UV; YZ|X) = 0$,
the inequality $I(U;Y) + I(V;Z) - I(U;V) \leq \max(I(X;Y), I(X;Z))$ holds whenever $X$, $Y$ and $Z$ are binary random
variables and the channel $p(y, z|x)$ is BSSC with parameter $\frac{1}{2}$.

Definition 3: Let $C_{M-I}(q(y, z|x))$ be a subset of $\mathbb{R}^6$ defined as the union of

$$\Delta\Big(\Big\{\big(I(W;Y),\ I(W;Z),\ I(UW;Y),\ I(VW;Z),$$
$$I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Y),$$
$$I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Z)\big)\Big\}\Big),$$

over random variables $U, V, W, X, Y, Z$, having the joint distribution $p(u, v, w, x, y, z) = p(u, v, w, x)q(y, z|x)$.
Note that the region $C_{M-I}(q(y, z|x))$ specifies $C_M(q(y, z|x))$, since given any $p(u, v, w, x, y, z) =
p(u, v, w, x)q(y, z|x)$ the corresponding vector in $C_{M-I}(q(y, z|x))$ provides the values of the mutual
information side (the right-hand side as written in (1)-(4)) of the 6 inequalities that define the region
$C_M(q(y, z|x))$. $C_{M-I}(q(y, z|x))$ is defined as a subset of $\mathbb{R}^6$, and not $\mathbb{R}^3$, for technical reasons that
will become clear later.

Definition 4: The region $C_M^{S_u, S_v, S_w}(q(y, z|x))$ is defined as the union of non-negative triples $(R_0, R_1, R_2)$
satisfying equations (1), (2), (3) and (4) over discrete random variables $U, V, W, X, Y, Z$ satisfying the cardinality
bounds $|\mathcal{U}| \leq S_u$, $|\mathcal{V}| \leq S_v$ and $|\mathcal{W}| \leq S_w$, and having the joint distribution $p(u, v, w, x, y, z) =
p(u, v, w, x)q(y, z|x)$. Note that $C_M^{S_u, S_v, S_w}(q(y, z|x)) \subset C_M^{S'_u, S'_v, S'_w}(q(y, z|x))$ whenever $S_u \leq S'_u$, $S_v \leq S'_v$
and $S_w \leq S'_w$.

The region $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ is defined as the union of the six-tuples mentioned in Definition 3
over discrete random variables $U, V, W, X, Y, Z$ satisfying the cardinality bounds $|\mathcal{U}| \leq S_u$, $|\mathcal{V}| \leq S_v$ and
$|\mathcal{W}| \leq S_w$, and having the joint distribution $p(u, v, w, x, y, z) = p(u, v, w, x)q(y, z|x)$.
Note that the region $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ specifies $C_M^{S_u, S_v, S_w}(q(y, z|x))$.

Definition 5: Let $\mathscr{L}(q(y, z|x))$ be equal to $C_M^{|\mathcal{X}|, |\mathcal{X}|, |\mathcal{X}|+4}(q(y, z|x))$, and $\mathscr{L}_I(q(y, z|x))$ be equal to
$C_{M-I}^{|\mathcal{X}|, |\mathcal{X}|, |\mathcal{X}|+4}(q(y, z|x))$.

The region $\mathscr{C}(q(y, z|x))$ is defined as the union over discrete random variables $U, V, W, X, Y, Z$
satisfying the cardinality bounds $|\mathcal{U}| \leq |\mathcal{X}|$, $|\mathcal{V}| \leq |\mathcal{X}|$ and $|\mathcal{W}| \leq |\mathcal{X}| + 4$, and having the joint
distribution $p(u, v, w, x, y, z) = p(u, v, w, x)q(y, z|x)$ for which $H(X|UVW) = 0$, of non-negative
triples $(R_0, R_1, R_2)$ satisfying equations (1), (2), (3) and (4). Please note that the definition of $\mathscr{C}(q(y, z|x))$
differs from that of $\mathscr{L}(q(y, z|x))$ since we have imposed the extra constraint $H(X|UVW) = 0$ on the
auxiliaries. $\mathscr{C}(q(y, z|x))$ is a computable subset of the region $C_M(q(y, z|x))$. The region $\mathscr{C}_I(q(y, z|x))$ is
defined similarly to $\mathscr{L}_I(q(y, z|x))$ but by adding the extra constraint $H(X|UVW) = 0$ on the auxiliaries.

Definition 6: Given a broadcast channel $q(y, z|x)$, let $C_{NE}(q(y, z|x))$ denote the union over random
variables $U, V, W, X, Y, Z$, having the joint distribution $p(u, v, w, x, y, z) = p(u)p(v)p(w|u, v)p(x|u, v, w)q(y, z|x)$,
of

$$R_0, R_1, R_2 \geq 0;$$
$$R_0 \leq \min(I(W;Y), I(W;Z));$$
$$R_0 + R_1 \leq I(UW;Y);$$
$$R_0 + R_2 \leq I(VW;Z);$$
$$R_0 + R_1 + R_2 \leq I(UW;Y) + I(V;Z|UW);$$
$$R_0 + R_1 + R_2 \leq I(VW;Z) + I(U;Y|VW).$$

$C_{NE}(q(y, z|x))$ is shown in [11] to be an outer bound to the capacity region of the broadcast channel.
Recently there has been a series of outer bounds on the broadcast channel [11][12][13][14]. Among these
outer bounds, only the outer bound of Nair and El Gamal [11] is computable, as no cardinality bounds
are known for the other outer bounds. It was shown in [15] that the following region is an alternative
characterization of the set of triples $(0, R_1, R_2)$ in $C_{NE}(q(y, z|x))$: the union over random variables
$U, V, X, Y, Z$ having the joint distribution $p(u, v, x, y, z) = p(u, v, x)q(y, z|x)$, of

$$R_1, R_2 \geq 0;$$
$$R_1 \leq I(U;Y);$$
$$R_2 \leq I(V;Z);$$
$$R_1 + R_2 \leq I(U;Y) + I(V;Z|U);$$
$$R_1 + R_2 \leq I(V;Z) + I(U;Y|V).$$

Definition 7: Given any finite random variable $X$, and real valued random variable $L$ where $|\mathbb{E}[L|X =
x]| < \infty$ for all $x \in \mathcal{X}$, $H_L(X)$ is defined as

$$H_L(X) = \sum_{x \in \mathcal{X}} p(X = x)\, \mathbb{E}[L|X = x] \log \frac{1}{p(X = x)}.$$

The motivation for defining $H_L(X)$ will become clear later. Note that $H_L(X)$ is linear in $\mathbb{E}[L|X = x]$
and in $L$, and can in general become negative. If $L$ is a constant random variable equal to 1, $H_L(X)$
reduces to Shannon's entropy.

Given finite random variables $X$ and $Y$, and real valued random variable $L$ where $|\mathbb{E}[L|X = x, Y =
y]| < \infty$ for all $x \in \mathcal{X}$ and $y \in \mathcal{Y}$, $H_L(X|Y)$ and $I_L(X;Y)$ are defined as follows:
$H_L(X|Y) = \sum_{y \in \mathcal{Y}} p(Y = y) H_L(X|Y = y)$, where

$$H_L(X|Y = y) = \sum_{x \in \mathcal{X}} p(X = x|Y = y)\, \mathbb{E}[L|X = x, Y = y] \log \frac{1}{p(X = x|Y = y)},$$

and

$$I_L(X;Y) = \sum_{x, y} p(X = x, Y = y)\, \mathbb{E}[L|X = x, Y = y] \log \frac{p(X = x, Y = y)}{p(X = x)p(Y = y)}.$$

It can be verified that $I_L(X;Y) = H_L(X) - H_L(X|Y) = H_L(Y) - H_L(Y|X)$.
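The identity $I_L(X;Y) = H_L(X) - H_L(X|Y)$ can be checked numerically from the definitions. The sketch below is only an illustration under simplifying assumptions: the joint distribution is a small matrix `p[x][y]`, and $L$ is taken to be a deterministic function `l[x][y]` of $(X, Y)$, so that $\mathbb{E}[L|X = x, Y = y] = $ `l[x][y]`. The function names and the numerical values are hypothetical.

```python
import math

def h_l_x(p, l):
    """H_L(X) = sum_x p(x) E[L|X=x] log2(1/p(x)), using p(x) E[L|X=x] = sum_y p(x,y) l(x,y)."""
    total = 0.0
    for x in range(len(p)):
        px = sum(p[x])
        weight = sum(p[x][y] * l[x][y] for y in range(len(p[x])))
        if px > 0:
            total += weight * math.log2(1 / px)
    return total

def h_l_x_given_y(p, l):
    """H_L(X|Y) = sum_{x,y} p(x,y) l(x,y) log2(p(y) / p(x,y))."""
    total = 0.0
    for y in range(len(p[0])):
        py = sum(row[y] for row in p)
        for x in range(len(p)):
            if p[x][y] > 0:
                total += p[x][y] * l[x][y] * math.log2(py / p[x][y])
    return total

def i_l(p, l):
    """I_L(X;Y) = sum_{x,y} p(x,y) l(x,y) log2(p(x,y) / (p(x) p(y)))."""
    total = 0.0
    for x in range(len(p)):
        px = sum(p[x])
        for y in range(len(p[0])):
            py = sum(row[y] for row in p)
            if p[x][y] > 0:
                total += p[x][y] * l[x][y] * math.log2(p[x][y] / (px * py))
    return total

# Hypothetical joint distribution on a 2 x 3 alphabet and an arbitrary direction l.
p = [[0.1, 0.2, 0.1], [0.3, 0.05, 0.25]]
l = [[1.0, -0.5, 2.0], [0.0, 1.5, -1.0]]
```

With `l` identically equal to 1, `i_l` reduces to the ordinary mutual information, which mirrors the remark that $H_L$ reduces to Shannon's entropy for constant $L = 1$.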



III. Description of the main technique 



In this section, we demonstrate the main idea of the paper. In order to show the essence of the proof
while avoiding unnecessary details, we consider a simpler problem that is different from the problem
at hand, although it will be used in the later proofs. In the following discussion, we assume that the
reader has read Lemma 3 from Section IV.

Given a broadcast channel $q(y, z|x)$ and an input distribution $p(x)$, let us consider the problem of
finding the supremum of

$$I(U;Y) + I(V;Z) - I(U;V) + \lambda I(U;Y) + \gamma I(V;Z)$$

over all joint distributions $p(u, v|x)p(x)q(y, z|x)$, where $\lambda$ and $\gamma$ are arbitrary non-negative reals, and the
auxiliary random variables $U$, $V$ have alphabet sets satisfying $|\mathcal{U}| \leq S_u$ and $|\mathcal{V}| \leq S_v$ for some natural
numbers $S_u$ and $S_v$. For this problem, we would like to show that it suffices to take the maximum over
random variables $U$ and $V$ with the cardinality bounds of $\min(|\mathcal{X}|, S_u)$ and $\min(|\mathcal{X}|, S_v)$. It suffices to
prove the following lemma:

Lemma 1: Given an arbitrary broadcast channel $q(y, z|x)$, an arbitrary input distribution $p(x)$, non-negative
reals $\lambda$ and $\gamma$, and natural numbers $S_u$ and $S_v$ where $S_u > |\mathcal{X}|$, the following holds:

$$\sup_{UV \to X \to YZ;\ |\mathcal{U}| \leq S_u;\ |\mathcal{V}| \leq S_v} I(U;Y) + I(V;Z) - I(U;V) + \lambda I(U;Y) + \gamma I(V;Z)$$
$$= I(\widehat{U};\widehat{Y}) + I(\widehat{V};\widehat{Z}) - I(\widehat{U};\widehat{V}) + \lambda I(\widehat{U};\widehat{Y}) + \gamma I(\widehat{V};\widehat{Z}),$$

where the random variables $\widehat{U}, \widehat{V}, \widehat{X}, \widehat{Y}, \widehat{Z}$ satisfy the following properties: the Markov chain
$\widehat{U}\widehat{V} \to \widehat{X} \to \widehat{Y}\widehat{Z}$ holds; the joint distribution of $\widehat{X}, \widehat{Y}, \widehat{Z}$ is the same as the joint distribution of $X, Y, Z$;
and furthermore $|\widehat{\mathcal{U}}| < S_u$, $|\widehat{\mathcal{V}}| \leq S_v$.
Proof: Since the cardinalities of $U$ and $V$ are bounded, one can show that the supremum of $I(U;Y) +
I(V;Z) - I(U;V) + \lambda I(U;Y) + \gamma I(V;Z)$ is a maximum,² and is obtained at some joint distribution
$p_0(u, v, x, y, z) = p_0(u, v, x)q(y, z|x)$. If $|\mathcal{U}| < S_u$, one can finish the proof by setting $(\widehat{U}, \widehat{V}, \widehat{X}, \widehat{Y}, \widehat{Z}) =
(U, V, X, Y, Z)$. One can also easily show the existence of appropriate $(\widehat{U}, \widehat{V}, \widehat{X}, \widehat{Y}, \widehat{Z})$ if $p(u) = 0$ for
some $u \in \mathcal{U}$. Therefore assume that $|\mathcal{U}| = S_u$ and $p(u) \neq 0$ for all $u \in \mathcal{U}$. Take an arbitrary non-zero
function $L : \mathcal{U} \times \mathcal{V} \times \mathcal{X} \to \mathbb{R}$ where $\mathbb{E}[L(U, V, X)|X] = 0$. Let us then perturb the joint distribution of
$U, V, X, Y, Z$ by defining random variables $\widehat{U}, \widehat{V}, \widehat{X}, \widehat{Y}$ and $\widehat{Z}$ distributed according to

$$p_\epsilon(\widehat{U} = u, \widehat{V} = v, \widehat{X} = x, \widehat{Y} = y, \widehat{Z} = z) =$$
$$p_0(U = u, V = v, X = x, Y = y, Z = z) \cdot \big(1 + \epsilon \cdot \mathbb{E}[L(U, V, X)|U = u, V = v, X = x, Y = y, Z = z]\big),$$

or equivalently according to

$$p_\epsilon(\widehat{U} = u, \widehat{V} = v, \widehat{X} = x, \widehat{Y} = y, \widehat{Z} = z) =$$
$$p_0(U = u, V = v, X = x, Y = y, Z = z)\big(1 + \epsilon \cdot L(u, v, x)\big) =$$
$$p_0(U = u, V = v, X = x)\, q(Y = y, Z = z|X = x)\big(1 + \epsilon \cdot L(u, v, x)\big).$$

The parameter $\epsilon$ is a real number that can take values in $[-\bar{\epsilon}_1, \bar{\epsilon}_2]$, where $\bar{\epsilon}_1$ and $\bar{\epsilon}_2$ are some positive
reals representing the extreme admissible values of $\epsilon$, i.e. $\min_{u,v,x} 1 - \bar{\epsilon}_1 \cdot L(u, v, x) = \min_{u,v,x} 1 +
\bar{\epsilon}_2 \cdot L(u, v, x) = 0$. Since $L$ is a function of $U$, $V$ and $X$ only, for any value of $\epsilon$, the Markov chain
$\widehat{U}\widehat{V} \to \widehat{X} \to \widehat{Y}\widehat{Z}$ holds, and $p(\widehat{Y} = y, \widehat{Z} = z|\widehat{X} = x)$ is equal to $q(Y = y, Z = z|X = x)$ for all $x, y, z$
where $p(\widehat{X} = x) > 0$. Furthermore $\mathbb{E}[L(U, V, X)|X] = 0$ implies that the marginal distribution of $X$ is
preserved by this perturbation. This is because

$$p_\epsilon(\widehat{X} = x) = p_0(X = x) \cdot \big(1 + \epsilon \cdot \mathbb{E}[L(U, V, X)|X = x]\big).$$

This further implies that the marginal distributions of $Y$ and $Z$ are also fixed.³

The expression $I(\widehat{U};\widehat{Y}) + I(\widehat{V};\widehat{Z}) - I(\widehat{U};\widehat{V}) + \lambda I(\widehat{U};\widehat{Y}) + \gamma I(\widehat{V};\widehat{Z})$ as a function of $\epsilon$ achieves
its maximum at $\epsilon = 0$ (by our assumption). Therefore its first derivative at $\epsilon = 0$ should be zero, and
its second derivative should be less than or equal to zero. Using Lemma 3, one can compute the first
derivative and set it to zero, and thereby get the following equation:

$$I_L(U;Y) + I_L(V;Z) - I_L(U;V) + \lambda I_L(U;Y) + \gamma I_L(V;Z) = 0. \qquad (5)$$

²Since the ranges of all the involved random variables are limited and the conditional mutual information function is
continuous, the set of admissible joint probability distributions $p(u, v, x, y, z)$ where $I(UV; YZ|X) = 0$ and $p(y, z, x) =
q(y, z|x)p(x)$ will be a compact set (when viewed as a subset of the Euclidean space). The fact that the mutual information function
is continuous implies that the union over random variables $U, V, X, Y, Z$ satisfying the cardinality bounds, having the joint
distribution $p(u, v, x, y, z) = p(u, v|x)p(x)q(y, z|x)$, of $I(U;Y) + I(V;Z) - I(U;V) + \lambda I(U;Y) + \gamma I(V;Z)$ is a compact
set, and thus closed.

³The terms $\mathbb{E}[L(U, V, X)|Y]$ and $\mathbb{E}[L(U, V, X)|Z]$ must be zero if $\mathbb{E}[L(U, V, X)|X] = 0$.

In order to compute the second derivative, one can expand the expression as entropy terms and use
Lemma 3 to compute the second derivative for each term. We can use the assumption that $\mathbb{E}[L(U, V, X)|X] = 0$
(which implies $\mathbb{E}[L(U, V, X)|Y] = 0$ and $\mathbb{E}[L(U, V, X)|Z] = 0$) to simplify the expression. In particular,
the second derivatives of $H(\widehat{Y})$ and $H(\widehat{Z})$ at $\epsilon = 0$ would be equal to zero (as the marginal distributions
of $Y$ and $Z$ are preserved under the perturbation), the second derivative of $I(\widehat{U};\widehat{Y})$ at $\epsilon = 0$ will be equal
to $-\log e \cdot \mathbb{E}[\mathbb{E}[L(U, V, X)|U]^2] + \log e \cdot \mathbb{E}[\mathbb{E}[L(U, V, X)|UY]^2]$, the second derivative of $I(\widehat{V};\widehat{Z})$ at $\epsilon = 0$
will be equal to $-\log e \cdot \mathbb{E}[\mathbb{E}[L(U, V, X)|V]^2] + \log e \cdot \mathbb{E}[\mathbb{E}[L(U, V, X)|VZ]^2]$, and the second derivative
of $-I(\widehat{U};\widehat{V})$ at $\epsilon = 0$ will be equal to $+\log e \cdot \mathbb{E}[\mathbb{E}[L(U, V, X)|U]^2] + \log e \cdot \mathbb{E}[\mathbb{E}[L(U, V, X)|V]^2] - \log e \cdot
\mathbb{E}[\mathbb{E}[L(U, V, X)|UV]^2]$. Note that the second derivatives of $I(\widehat{U};\widehat{Y})$ and $I(\widehat{V};\widehat{Z})$ are always non-negative.
Since the second derivative of the expression $I(\widehat{U};\widehat{Y}) + I(\widehat{V};\widehat{Z}) - I(\widehat{U};\widehat{V}) + \lambda I(\widehat{U};\widehat{Y}) + \gamma I(\widehat{V};\widehat{Z})$ at
$\epsilon = 0$ must be non-positive, the second derivative of $I(\widehat{U};\widehat{Y}) + I(\widehat{V};\widehat{Z}) - I(\widehat{U};\widehat{V})$ must be non-positive
at $\epsilon = 0$. The second derivative of the latter expression is equal to $+\log e \cdot \mathbb{E}[\mathbb{E}[L(U, V, X)|UY]^2] +
\log e \cdot \mathbb{E}[\mathbb{E}[L(U, V, X)|VZ]^2] - \log e \cdot \mathbb{E}[\mathbb{E}[L(U, V, X)|UV]^2]$. Hence we conclude that for any non-zero
function $L : \mathcal{U} \times \mathcal{V} \times \mathcal{X} \to \mathbb{R}$ where $\mathbb{E}[L(U, V, X)|X] = 0$ we must have:

$$\mathbb{E}[\mathbb{E}[L(U, V, X)|UY]^2] + \mathbb{E}[\mathbb{E}[L(U, V, X)|VZ]^2] - \mathbb{E}[\mathbb{E}[L(U, V, X)|UV]^2] \leq 0. \qquad (6)$$

Next, take an arbitrary non-zero function $L' : \mathcal{U} \to \mathbb{R}$ where $\mathbb{E}[L'(U)|X] = 0$. Since $|\mathcal{U}| = S_u > |\mathcal{X}|$,
such a non-zero function $L'$ exists. Note that the direction of perturbation $L'$ being only a function of $U$
implies that

$$p_\epsilon(\widehat{U} = u, \widehat{V} = v, \widehat{X} = x, \widehat{Y} = y, \widehat{Z} = z) =$$
$$p_\epsilon(\widehat{U} = u)\, p_0(V = v, X = x, Y = y, Z = z|U = u).$$

In other words, the perturbation only changes the marginal distribution of $U$, but preserves the conditional
distribution $p_0(V = v, X = x, Y = y, Z = z|U = u)$.
Note that

$$\mathbb{E}[\mathbb{E}[L'(U)|UY]^2] = \mathbb{E}[\mathbb{E}[L'(U)|UV]^2] = \mathbb{E}[L'(U)^2].$$

Together with (6), this implies that $\mathbb{E}[\mathbb{E}[L'(U)|VZ]^2]$ should be non-positive. But this can happen only when
$\mathbb{E}[L'(U)|VZ] = 0$. Therefore any arbitrary function $L' : \mathcal{U} \to \mathbb{R}$ where $\mathbb{E}[L'(U)|X] = 0$ must also satisfy
$\mathbb{E}[L'(U)|VZ] = 0$. In other words, any arbitrary direction of perturbation $L'$ that is a function of $U$ and preserves the
marginal distribution of $X$, must also preserve the marginal distribution of $VZ$.⁴

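We next show that the expression $I(\widehat{U};\widehat{Y}) + I(\widehat{V};\widehat{Z}) - I(\widehat{U};\widehat{V}) + \lambda I(\widehat{U};\widehat{Y}) + \gamma I(\widehat{V};\widehat{Z})$ as a function
of $\epsilon$ is constant.⁵ Using the last part of Lemma 3 (with $L$ standing for $L'(U)$), one can write:

$$I(\widehat{U};\widehat{Y}) = I(U;Y) + \epsilon \cdot I_L(U;Y) - \mathbb{E}[r(\epsilon \cdot \mathbb{E}[L|U])] - \mathbb{E}[r(\epsilon \cdot \mathbb{E}[L|Y])] + \mathbb{E}[r(\epsilon \cdot \mathbb{E}[L|UY])]$$
$$= I(U;Y) + \epsilon \cdot I_L(U;Y), \qquad (7)$$

where $r(x) = (1 + x)\log(1 + x)$. Equation (7) holds because $\mathbb{E}[L|Y] = 0$ and $\mathbb{E}[L|U] = \mathbb{E}[L|UY]$.
Similarly, using the last part of Lemma 3, one can write:

$$I(\widehat{U};\widehat{V}) = I(U;V) + \epsilon \cdot I_L(U;V) - \mathbb{E}[r(\epsilon \cdot \mathbb{E}[L|U])] - \mathbb{E}[r(\epsilon \cdot \mathbb{E}[L|V])] + \mathbb{E}[r(\epsilon \cdot \mathbb{E}[L|UV])]$$
$$= I(U;V) + \epsilon \cdot I_L(U;V), \qquad (8)$$

where $r(x) = (1 + x)\log(1 + x)$. Equation (8) holds because $\mathbb{E}[L|V] = 0$ and $\mathbb{E}[L|U] = \mathbb{E}[L|UV]$. One
can similarly show that the term $I(\widehat{V};\widehat{Z})$ can be written as $I(V;Z) + \epsilon \cdot I_L(V;Z)$ (with $I_L(V;Z) = 0$,
since $\mathbb{E}[L|VZ] = 0$). Therefore the
expression $I(\widehat{U};\widehat{Y}) + I(\widehat{V};\widehat{Z}) - I(\widehat{U};\widehat{V}) + \lambda I(\widehat{U};\widehat{Y}) + \gamma I(\widehat{V};\widehat{Z})$ as a function of $\epsilon$ is equal to

$$I(U;Y) + I(V;Z) - I(U;V) + \lambda I(U;Y) + \gamma I(V;Z)\ +$$
$$\epsilon \cdot \big(I_L(U;Y) + I_L(V;Z) - I_L(U;V) + \lambda I_L(U;Y) + \gamma I_L(V;Z)\big). \qquad (9)$$

Equation (5) implies that this expression is equal to $I(U;Y) + I(V;Z) - I(U;V) + \lambda I(U;Y) + \gamma I(V;Z)$.

Therefore the expression $I(\widehat{U};\widehat{Y}) + I(\widehat{V};\widehat{Z}) - I(\widehat{U};\widehat{V}) + \lambda I(\widehat{U};\widehat{Y}) + \gamma I(\widehat{V};\widehat{Z})$ as a function of $\epsilon$ is
constant. Since the function $L'$ is non-zero, setting $\epsilon = -\bar{\epsilon}_1$ or $\epsilon = \bar{\epsilon}_2$ will result in a marginal distribution
on $\widehat{U}$ with a smaller support than that of $U$, since the marginal distribution of $U$ is being perturbed as follows:

$$p_\epsilon(\widehat{U} = u) = p_0(U = u) \cdot \big(1 + \epsilon L'(u)\big).$$

This perturbation does not increase the support and would decrease it by at least one when $\epsilon$ is at its
maximum or minimum, i.e. when $\epsilon = -\bar{\epsilon}_1$ or $\epsilon = \bar{\epsilon}_2$. Therefore one is able to define a random variable
with a smaller cardinality than that of $U$ while leaving the value of $I(U;Y) + I(V;Z) - I(U;V) +
\lambda I(U;Y) + \gamma I(V;Z)$ unaffected.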

Discussion: Aside from establishing cardinality bounds, the above argument implies that if the maximum
of $I(U;Y) + I(V;Z) - I(U;V) + \lambda I(U;Y) + \gamma I(V;Z)$ is obtained at some joint distribution
$p_0(u, v, x, y, z) = p_0(u, v, x)q(y, z|x)$, equations (5) and (6) must hold for any non-zero function $L :
\mathcal{U} \times \mathcal{V} \times \mathcal{X} \to \mathbb{R}$ where $\mathbb{E}[L(U, V, X)|X] = 0$. The proof used these properties to a limited extent.

⁴Note that $p_\epsilon(\widehat{V} = v, \widehat{Z} = z) = p_0(V = v, Z = z) \cdot \big(1 + \epsilon \cdot \mathbb{E}[L(U, V, X)|V = v, Z = z]\big) = p_0(V = v, Z = z)$.
⁵The authors would like to thank Chandra Nair for suggesting this shortcut to simplify the original proof.

IV. Statement of results

Theorem 1: For any arbitrary broadcast channel $q(y, z|x)$, the closure of $C_M(q(y, z|x))$ is equal to
$\mathscr{C}(q(y, z|x))$.

Corollary 1: $C_M(q(y, z|x))$ is closed since $\mathscr{C}(q(y, z|x))$ is also a subset of $C_M(q(y, z|x))$.

Lemma 2: For any arbitrary natural numbers $S_u$, $S_v$ and $S_w$, the following statements hold:

• $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ is a closed subset of $\mathbb{R}^6$;

• $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ is a subset of $C_{M-I}^{S_u, S_v, |\mathcal{X}|+4}(q(y, z|x))$;

• $C_{M-I}^{S_u, S_v, |\mathcal{X}|+4}(q(y, z|x))$ is convex.

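Lemma 3: Given any finite random variable $X$, and real valued random variable $L$ where $|\mathbb{E}[L|X = x]| < \infty$
for all $x \in \mathcal{X}$, and $\mathbb{E}[L] = 0$, let the random variable $\widehat{X}$ be defined on the same alphabet set
as $X$ according to $p_\epsilon(\widehat{X} = x) = p_0(X = x) \cdot (1 + \epsilon \cdot \mathbb{E}[L|X = x])$, where $\epsilon$ is a real number
in the interval $[-\bar{\epsilon}_1, \bar{\epsilon}_2]$. $\bar{\epsilon}_1$ and $\bar{\epsilon}_2$ are positive reals for which $\min_x 1 - \bar{\epsilon}_1 \cdot \mathbb{E}[L|X = x] \geq 0$ and
$\min_x 1 + \bar{\epsilon}_2 \cdot \mathbb{E}[L|X = x] \geq 0$ hold. Then

1) $H(\widehat{X})\big|_{\epsilon=0} = H(X)$, and $\frac{\partial}{\partial \epsilon} H(\widehat{X})\big|_{\epsilon=0} = H_L(X)$.

2) $\forall \epsilon \in (-\bar{\epsilon}_1, \bar{\epsilon}_2)$, $\frac{\partial^2}{\partial \epsilon^2} H(\widehat{X}) = -\log e \cdot \mathbb{E}\Big[\frac{\mathbb{E}[L|X]^2}{1 + \epsilon \cdot \mathbb{E}[L|X]}\Big] = -\log(e) \cdot I(\epsilon)$, where the Fisher information
$I(\epsilon)$ is defined as $I(\epsilon) = \sum_x \big(\frac{\partial}{\partial \epsilon} \log_e p_\epsilon(\widehat{X} = x)\big)^2 p_\epsilon(\widehat{X} = x)$. In particular $\frac{\partial^2}{\partial \epsilon^2} H(\widehat{X})\big|_{\epsilon=0} =
-\log e \cdot \mathbb{E}[\mathbb{E}[L|X]^2]$.

3) $H(\widehat{X}) = H(X) + \epsilon H_L(X) - \mathbb{E}[r(\epsilon \cdot \mathbb{E}[L|X])]$ where $r(x) = (1 + x)\log(1 + x)$.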
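The three parts of Lemma 3 lend themselves to a quick numerical sanity check. The sketch below is an illustration only: the ternary distribution `p0` and the direction `a` (standing for the values $\mathbb{E}[L|X = x]$, chosen so that $\mathbb{E}[L] = 0$) are hypothetical, all function names are made up for this example, and derivatives are approximated by central finite differences rather than computed symbolically.

```python
import math

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def perturb(p0, a, eps):
    """p_eps(x) = p0(x) * (1 + eps * a[x]), where a[x] stands for E[L | X = x]."""
    return [p * (1 + eps * ax) for p, ax in zip(p0, a)]

def h_l(p0, a):
    """H_L(X) = sum_x p0(x) * a[x] * log2(1 / p0(x))."""
    return sum(p * ax * math.log2(1 / p) for p, ax in zip(p0, a) if p > 0)

def r(x):
    """r(x) = (1 + x) * log2(1 + x), extended by continuity with r(-1) = 0."""
    return 0.0 if x == -1 else (1 + x) * math.log2(1 + x)

def closed_form_entropy(p0, a, eps):
    """Part 3: H(X) + eps * H_L(X) - E[r(eps * E[L|X])]."""
    return entropy(p0) + eps * h_l(p0, a) - sum(p * r(eps * ax) for p, ax in zip(p0, a))

def first_derivative(p0, a, h=1e-5):
    """Central finite difference of eps -> H(X_hat) at eps = 0 (compare with part 1)."""
    return (entropy(perturb(p0, a, h)) - entropy(perturb(p0, a, -h))) / (2 * h)

def second_derivative(p0, a, h=1e-3):
    """Central finite difference of the second derivative at eps = 0 (compare with part 2)."""
    return (entropy(perturb(p0, a, h)) - 2 * entropy(p0)
            + entropy(perturb(p0, a, -h))) / h ** 2

def fisher_prediction(p0, a):
    """-log2(e) * E[ E[L|X]^2 ], the value claimed in part 2 at eps = 0."""
    return -math.log2(math.e) * sum(p * ax ** 2 for p, ax in zip(p0, a))

# Hypothetical example: 0.5*0.4 + 0.3*(-1.0) + 0.2*0.5 = 0, so E[L] = 0.
p0 = [0.5, 0.3, 0.2]
a = [0.4, -1.0, 0.5]
```

For this example the admissible range of $\epsilon$ is $[-2, 1]$, so a value such as `eps = 0.3` can be used to compare `entropy(perturb(p0, a, eps))` against `closed_form_entropy(p0, a, eps)`.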

A. On binary input broadcast channels

In this section, we study binary input broadcast channels, that is when $|\mathcal{X}| = 2$. It therefore suffices to
consider binary random variables $U$ and $V$. The cardinality of $W$ would be six and $X$ can be taken to be
a deterministic function of $(U, V, W)$. Still, the region is hard to evaluate. We however demonstrate that
the computation can be greatly simplified if we make the extra assumption that $C_M(q(y, z|x))$ and the
recent outer bound of Nair and El Gamal, $C_{NE}(q(y, z|x))$, match at the given broadcast channel $q(y, z|x)$.
We demonstrate this by computing the maximum of the sum rate $R_1 + R_2$ over all triples $(R_0, R_1, R_2)$ in
$C_M(q(y, z|x))$. For simplicity, we assume that for any $y \in \mathcal{Y}$ and $z \in \mathcal{Z}$, $p(Y = y|X = 0)$, $p(Y =
y|X = 1)$, $p(Z = z|X = 0)$ and $p(Z = z|X = 1)$ are non-zero. This is a mild assumption since an
arbitrarily small perturbation of a broadcast channel would place it in this class.

Theorem 2: Take an arbitrary binary input broadcast channel $q(y, z|x)$ such that for all $y \in \mathcal{Y}$ and
$z \in \mathcal{Z}$, $q(Y = y|X = 0)$, $q(Y = y|X = 1)$, $q(Z = z|X = 0)$ and $q(Z = z|X = 1)$ are non-zero.
Assuming that $C_M(q(y, z|x)) = C_{NE}(q(y, z|x))$, the maximum of the sum rate $R_1 + R_2$ over triples
$(R_0, R_1, R_2)$ in Marton's inner bound is equal to

$$\max\Bigg( \min_{\gamma \in [0,1]} \bigg( \max_{\substack{p(w, x)q(y, z|x) \\ |\mathcal{W}| = 2}} \gamma I(W;Y) + (1 - \gamma) I(W;Z) + \sum_w p(w)\, T\big(p(X = 1|W = w)\big) \bigg),$$
$$\max_{\substack{p(u, v)p(x|uv)q(y, z|x) \\ |\mathcal{U}| = |\mathcal{V}| = 2,\ I(U;V) = 0,\ H(X|UV) = 0}} I(U;Y) + I(V;Z) \Bigg), \qquad (10)$$

where $T(p) = \max\{I(X;Y), I(X;Z) \mid P(X = 1) = p\}$.
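The function $T(p)$ is simple to evaluate for a given binary input channel, since it only involves the two marginal channels $q(y|x)$ and $q(z|x)$. The following is a minimal sketch, assuming each marginal channel is given as a list of two rows with `rows[x][y]` $= P(Y = y|X = x)$; the function names `mutual_information_binary_input` and `t_of_p` are hypothetical and introduced only for this illustration.

```python
import math

def mutual_information_binary_input(p1, rows):
    """I(X;Y) in bits for P(X=1) = p1 and rows[x][y] = P(Y=y | X=x), x in {0, 1}."""
    px = [1 - p1, p1]
    n_out = len(rows[0])
    py = [sum(px[x] * rows[x][y] for x in range(2)) for y in range(n_out)]
    total = 0.0
    for x in range(2):
        for y in range(n_out):
            pxy = px[x] * rows[x][y]
            if pxy > 0:
                total += pxy * math.log2(rows[x][y] / py[y])
    return total

def t_of_p(p1, qy, qz):
    """T(p) = max{ I(X;Y), I(X;Z) } evaluated at P(X=1) = p1."""
    return max(mutual_information_binary_input(p1, qy),
               mutual_information_binary_input(p1, qz))
```

The term $\sum_w p(w)\, T(p(X = 1|W = w))$ in (10) is then a weighted sum of `t_of_p` values, one per value of $W$; the outer optimizations over $p(w, x)$ and $\gamma$ are not attempted in this sketch.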

Remark 1: The expression given in equation (10) is always a lower bound on the maximum of the sum
rate $R_1 + R_2$ over triples $(R_0, R_1, R_2)$ in Marton's inner bound, whether $C_M(q(y, z|x))$ is equal to
$C_{NE}(q(y, z|x))$ or not.

Corollary 2: Take an arbitrary binary input broadcast channel $q(y, z|x)$ such that for all $y \in \mathcal{Y}$ and
$z \in \mathcal{Z}$, $q(Y = y|X = 0)$, $q(Y = y|X = 1)$, $q(Z = z|X = 0)$ and $q(Z = z|X = 1)$ are non-zero. If
the expression of equation (10) turns out to be strictly less than the maximum of the sum rate $R_1 + R_2$
over triples $(R_0, R_1, R_2)$ in $C_{NE}(q(y, z|x))$ (which is given in [15]), it will serve as evidence for
$C_M(q(y, z|x)) \neq C_{NE}(q(y, z|x))$. The maximum of the sum rate $R_1 + R_2$ over triples $(R_0, R_1, R_2)$ in
$C_{NE}(q(y, z|x))$ is known to be [15]

$$\max_{p(u, v, x)q(y, z|x)} \min\big(I(U;Y) + I(V;Z),\ I(U;Y) + I(V;Z|U),\ I(V;Z) + I(U;Y|V)\big),$$

which can be written as (see Bound 4 in [15])

$$\max_{\substack{p(u, v, x)q(y, z|x) \\ |\mathcal{U}| = |\mathcal{V}| = 3,\ I(U;V|X) = 0}} \min\big(I(U;Y) + I(V;Z),\ I(U;Y) + I(X;Z|U),\ I(V;Z) + I(X;Y|V)\big).$$

There are examples for which the expression of equation (10) turns out to be strictly less than the maximum
of the sum rate $R_1 + R_2$ over triples $(R_0, R_1, R_2)$ in $C_{NE}(q(y, z|x))$. For instance, given any two positive
reals $\alpha$ and $\beta$ in the interval $(0, 1)$, consider the broadcast channel for which $|\mathcal{X}| = |\mathcal{Y}| = |\mathcal{Z}| = 2$,
$p(Y = 0|X = 0) = \alpha$, $p(Y = 0|X = 1) = \beta$, $p(Z = 0|X = 0) = 1 - \beta$, $p(Z = 0|X = 1) = 1 - \alpha$.
Assuming $\alpha = 0.01$, Figure 1 plots the maximum of the sum rate for $C_{NE}(q(y, z|x))$, and the maximum of the
sum rate for $C_M(q(y, z|x))$ (assuming that $C_{NE}(q(y, z|x)) = C_M(q(y, z|x))$) as a function of $\beta$. Where
the two curves do not match, Nair and El Gamal's outer bound and Marton's inner bound cannot be
equal for the corresponding broadcast channel.



Fig. 1. Sum rate curves for $\alpha = 0.01$. Red curve (top curve): sum rate for $C_{NE}(q(y, z|x))$; blue curve (bottom curve):
sum rate for $C_M(q(y, z|x))$ assuming that $C_{NE}(q(y, z|x)) = C_M(q(y, z|x))$.



V. Proofs 



Proof of Theorem 1: In the appendices we prove that the closure of $C_M(q(y, z|x))$ is
equal to the closure of $\bigcup_{S_u, S_v, S_w > 0} C_M^{S_u, S_v, S_w}(q(y, z|x))$, and that $\mathscr{C}(q(y, z|x))$ is equal to $\mathscr{L}(q(y, z|x))$.
Therefore we need to show that the closure of $\bigcup_{S_u, S_v, S_w > 0} C_M^{S_u, S_v, S_w}(q(y, z|x))$ is equal to $\mathscr{L}(q(y, z|x))$.
It suffices to prove that $\mathscr{L}(q(y, z|x))$ is closed, and that for any arbitrary natural numbers $S_u$, $S_v$ and
$S_w$, $C_M^{S_u, S_v, S_w}(q(y, z|x)) \subset \mathscr{L}(q(y, z|x))$. The former can be proven using Lemma 2, according to which
the region $\mathscr{L}_I(q(y, z|x))$ is closed.⁶ To show the latter, it suffices to prove that
$C_{M-I}^{S_u, S_v, S_w}(q(y, z|x)) \subset \mathscr{L}_I(q(y, z|x))$.⁷ Lemma 2 shows that the regions $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ and $\mathscr{L}_I(q(y, z|x))$
are closed. Lemma 2 implies that the region $\mathscr{L}_I(q(y, z|x))$ is convex. In order to prove that
$C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ is a subset of $\mathscr{L}_I(q(y, z|x))$, it suffices to show that for any supporting hyperplane
of $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$, the half-space delimited by the hyperplane which contains $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$
is contained in the corresponding half-space for $\mathscr{L}_I(q(y, z|x))$.⁸

⁶The region $\mathscr{L}_I(q(y, z|x))$ determines $\mathscr{L}(q(y, z|x))$. In order to show that the closedness of $\mathscr{L}_I(q(y, z|x))$ implies the
closedness of $\mathscr{L}(q(y, z|x))$, take a convergent sequence $(R_{0,i}, R_{1,i}, R_{2,i})$ in $\mathscr{L}(q(y, z|x))$. We would like to show that
$(R_0, R_1, R_2) = \lim_{i \to \infty}(R_{0,i}, R_{1,i}, R_{2,i})$ belongs to $\mathscr{L}(q(y, z|x))$. The six-tuple $(R_{0,i}, R_{0,i}, R_{0,i} + R_{1,i}, R_{0,i} + R_{2,i}, R_{0,i} +
R_{1,i} + R_{2,i}, R_{0,i} + R_{1,i} + R_{2,i})$ is in $\mathscr{L}_I(q(y, z|x))$. Since $\mathscr{L}_I(q(y, z|x))$ is closed, the limit of these six-tuples,
$(R_0, R_0, R_0 + R_1, R_0 + R_2, R_0 + R_1 + R_2, R_0 + R_1 + R_2)$, is also in
$\mathscr{L}_I(q(y, z|x))$. Thus, $(R_0, R_1, R_2) = \lim_{i \to \infty}(R_{0,i}, R_{1,i}, R_{2,i})$ belongs to $\mathscr{L}(q(y, z|x))$.

A supporting hyperplane of $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ is identified with constants $\lambda_1, \lambda_2, \ldots, \lambda_6$ and the
maximum of $\sum_{i=1}^{6} \lambda_i R'_i$ over six-tuples $(R'_1, R'_2, \ldots, R'_6)$ in $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$. We must have $\lambda_i \geq 0$
for $i = 1, 2, \ldots, 6$, since if $\lambda_i$ is negative, $R_i$ can be made to converge to $-\infty$, causing $\sum_{i=1}^{6} \lambda_i R_i$ to
converge to $\infty$, and hence not finite. Our goal is therefore to show that for any non-negative values of
$\lambda_i$ ($i = 1, 2, \ldots, 6$), the maximum of $\sum_{i=1}^{6} \lambda_i R_i$ over $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ is less than or equal to the
corresponding maximum over $\mathscr{L}_I(q(y, z|x))$.

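First consider the case where $\lambda_5 = \lambda_6 = 0$. Let $(R_1, R_2, \ldots, R_6)$ be a point in $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ where
the maximum of $\sum_{i=1}^{6} \lambda_i R_i$ over $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ is obtained. Corresponding to $(R_1, R_2, \ldots, R_6)$ is
at least one joint distribution $p_0(u, v, w, x, y, z) = p_0(u, v, w, x)q(y, z|x)$ on $U, V, W, X, Y, Z$ where
$|\mathcal{U}| \leq S_u$, $|\mathcal{V}| \leq S_v$ and $|\mathcal{W}| \leq S_w$, and furthermore the following inequalities are satisfied: $R_1 \leq
I(W;Y)$, $R_2 \leq I(W;Z)$, $R_3 \leq I(UW;Y)$, etc. The maximum of $\sum_{i=1}^{6} \lambda_i R'_i = \sum_{i=1}^{4} \lambda_i R'_i$ over
$C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ must then be equal to $\lambda_1 \cdot I(W;Y) + \lambda_2 \cdot I(W;Z) + \lambda_3 \cdot I(UW;Y) + \lambda_4 \cdot I(VW;Z)$.
Let $\widehat{U} = \widehat{V} = X$. Clearly $I(UW;Y) \leq I(\widehat{U}W;Y)$ and $I(VW;Z) \leq I(\widehat{V}W;Z)$. Hence the maximum
of $\sum_{i=1}^{6} \lambda_i R_i$ over $C_{M-I}^{S_u, S_v, S_w}(q(y, z|x))$ would be less than or equal to the maximum of $\sum_{i=1}^{6} \lambda_i R_i$
over $C_{M-I}^{|\mathcal{X}|, |\mathcal{X}|, S_w}(q(y, z|x))$. The latter is itself less than or equal to the maximum of $\sum_{i=1}^{6} \lambda_i R_i$ over
$C_{M-I}^{|\mathcal{X}|, |\mathcal{X}|, |\mathcal{X}|+4}(q(y, z|x))$ by Lemma 2. This implies the desired result when $\lambda_5 = \lambda_6 = 0$.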

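Next consider the case when either $\lambda_5 > 0$ or $\lambda_6 > 0$, or both. We proceed by proving the following three inequalities:

$$\max_{(R'_1,\ldots,R'_6)\in \mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))} \sum_{i=1}^{6} \lambda_i R'_i \;\le\; \max_{(R'_1,\ldots,R'_6)\in \mathcal{C}_{M-I}^{|\mathcal{X}|,S_v,S_w}(q(y,z|x))} \sum_{i=1}^{6} \lambda_i R'_i \qquad (11)$$

$$\max_{(R'_1,\ldots,R'_6)\in \mathcal{C}_{M-I}^{|\mathcal{X}|,S_v,S_w}(q(y,z|x))} \sum_{i=1}^{6} \lambda_i R'_i \;\le\; \max_{(R'_1,\ldots,R'_6)\in \mathcal{C}_{M-I}^{|\mathcal{X}|,|\mathcal{X}|,S_w}(q(y,z|x))} \sum_{i=1}^{6} \lambda_i R'_i \qquad (12)$$

$$\max_{(R'_1,\ldots,R'_6)\in \mathcal{C}_{M-I}^{|\mathcal{X}|,|\mathcal{X}|,S_w}(q(y,z|x))} \sum_{i=1}^{6} \lambda_i R'_i \;\le\; \max_{(R'_1,\ldots,R'_6)\in \mathcal{C}_{M-I}^{|\mathcal{X}|,|\mathcal{X}|,|\mathcal{X}|+4}(q(y,z|x))} \sum_{i=1}^{6} \lambda_i R'_i \qquad (13)$$

The proof of equation (11) is provided in Appendix I. The proof of equation (12) is similar. Equation (13) follows from Lemma 2. ■

$^{7}$ This is true because $(R_0, R_1, R_2)$ being in $\mathcal{C}_M^{S_u,S_v,S_w}(q(y,z|x))$ implies that $(R_0, R_0, R_0 + R_1, R_0 + R_2, R_0 + R_1 + R_2, R_0 + R_1 + R_2)$ is in $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$. If $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is a subset of $\mathscr{C}_I(q(y,z|x))$, the latter point would belong to $\mathscr{C}_I(q(y,z|x))$. Therefore $(R_0, R_1, R_2)$ belongs to $\mathscr{C}(q(y,z|x))$.

$^{8}$ This is because the closed convex set $\mathscr{C}_I(q(y,z|x))$ can be expressed as the intersection of its supporting half-spaces, i.e. $(R_1, R_2, \ldots, R_6) \in \mathscr{C}_I(q(y,z|x))$ if and only if for any $\lambda_1, \lambda_2, \ldots, \lambda_6$, $\sum_{i=1}^{6} \lambda_i R_i$ is less than or equal to the maximum of $\sum_{i=1}^{6} \lambda_i R'_i$ over six-tuples $(R'_1, R'_2, \ldots, R'_6)$ in $\mathscr{C}_I(q(y,z|x))$. Thus $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is a subset of $\mathscr{C}_I(q(y,z|x))$ if and only if the maximum of $\sum_{i=1}^{6} \lambda_i R'_i$ over six-tuples $(R'_1, R'_2, \ldots, R'_6)$ in $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is less than or equal to the same maximum over $\mathscr{C}_I(q(y,z|x))$.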
Proof of Theorem 2: The maximum of the sum rate $R_1 + R_2$ over triples $(R_0, R_1, R_2)$ in $\mathcal{C}_M(q(y,z|x))$ is equal to

$$\max_{\substack{p(u,v,w,x)q(y,z|x) \\ |\mathcal{U}| = 2,\ |\mathcal{V}| = 2 \\ H(X|UVW) = 0}} I(U;Y|W) + I(V;Z|W) - I(U;V|W) + \min(I(W;Y), I(W;Z)). \qquad (14)$$

The proof consists of two parts: first we show that the above expression is equal to the following expression:

$$\max\Bigg( \max_{p(w,x)q(y,z|x)} \min(I(W;Y), I(W;Z)) + \sum_{w} p(w)\,T(p(X=1|W=w)),\ \max_{\substack{p(u,v)p(x|uv)q(y,z|x) \\ |\mathcal{U}| = |\mathcal{V}| = 2,\ I(U;V) = 0,\ H(X|UV) = 0}} I(U;Y) + I(V;Z) \Bigg). \qquad (15)$$

Next, we show that the expression of equation (15) is equal to the expression given in Theorem 2.

The expression of equation (14) is greater than or equal to the expression of equation (15).$^{9}$ For the first part of the proof we thus need to prove that the expression of equation (14) is less than or equal to the expression of equation (15). Take the joint distribution $p(u,v,w,x)$ that maximizes the expression of equation (14). Let $\widetilde{U} = (U,W)$ and $\widetilde{V} = (V,W)$. The maximum of the sum rate $R_1 + R_2$ over triples $(R_0, R_1, R_2)$ in $\mathcal{C}_{NE}(q(y,z|x))$ is greater than or equal to $\min\big(I(\widetilde{U};Y) + I(\widetilde{V};Z),\ I(\widetilde{U};Y) + I(\widetilde{V};Z|\widetilde{U}),\ I(\widetilde{V};Z) + I(\widetilde{U};Y|\widetilde{V})\big)$ (see Bound 3 in [15]). Since $\mathcal{C}_{NE}(q(y,z|x)) = \mathcal{C}_M(q(y,z|x))$, we must have:

$$\min\big(I(UW;Y) + I(VW;Z),\ I(UW;Y) + I(VW;Z|UW),\ I(VW;Z) + I(UW;Y|VW)\big) \le I(U;Y|W) + I(V;Z|W) - I(U;V|W) + \min(I(W;Y), I(W;Z)).$$

$^{9}$ Consider the following special cases: 1) given $W = w$, let $(U,V) = (X, \text{constant})$ if $I(X;Y|W=w) \ge I(X;Z|W=w)$, and $(U,V) = (\text{constant}, X)$ otherwise. This would produce the first part of the expression given in Theorem 2. 2) Assume that $W$ is constant, and $U$ is independent of $V$. This would produce the second part of the expression given in Theorem 2.

Or alternatively

$$\min\Big(\max(I(W;Y), I(W;Z)) + I(U;V|W),\ \ I(W;Y) - \min(I(W;Y), I(W;Z)) + I(U;V|WZ),\ \ I(W;Z) - \min(I(W;Y), I(W;Z)) + I(U;V|WY)\Big) \le 0.$$

Since each of the three terms is also greater than or equal to zero, at least one of the three terms must be equal to zero. Therefore at least one of the following must hold:

1) $I(W;Y) = I(W;Z) = 0$ and $I(U;V|W) = 0$,

2) $I(U;V|WY) = 0$,

3) $I(U;V|WZ) = 0$.
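The passage from the first inequality to its "alternative" form uses only the chain rule for mutual information. The derivation below is a sketch in the notation of the proof; the shorthand $\Sigma$ for the right-hand side is introduced here only for brevity and does not appear in the original argument.

```latex
% Shorthand (introduced for this note only):
%   \Sigma = I(U;Y|W) + I(V;Z|W) - I(U;V|W) + \min(I(W;Y), I(W;Z))
\begin{align*}
I(UW;Y) + I(VW;Z) - \Sigma
  &= I(W;Y) + I(W;Z) - \min(I(W;Y), I(W;Z)) + I(U;V|W) \\
  &= \max(I(W;Y), I(W;Z)) + I(U;V|W), \\[4pt]
I(UW;Y) + I(VW;Z|UW) - \Sigma
  &= I(W;Y) - \min(I(W;Y), I(W;Z)) + I(V;Z|UW) + I(U;V|W) - I(V;Z|W) \\
  &= I(W;Y) - \min(I(W;Y), I(W;Z)) + I(V;UZ|W) - I(V;Z|W) \\
  &= I(W;Y) - \min(I(W;Y), I(W;Z)) + I(U;V|WZ), \\[4pt]
I(VW;Z) + I(UW;Y|VW) - \Sigma
  &= I(W;Z) - \min(I(W;Y), I(W;Z)) + I(U;Y|VW) + I(U;V|W) - I(U;Y|W) \\
  &= I(W;Z) - \min(I(W;Y), I(W;Z)) + I(U;V|WY).
\end{align*}
```

Each of the three right-hand sides is a sum of non-negative quantities, which is why a non-positive minimum forces one of them to vanish.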

If (1) holds, $I(U;Y|W) + I(V;Z|W) - I(U;V|W) + \min(I(W;Y), I(W;Z))$ equals $I(U;Y|W) + I(V;Z|W)$. Suppose $\max_{w:\, p(w) > 0} I(U;Y|W=w) + I(V;Z|W=w)$ occurs at some $w^*$. Clearly $I(U;Y|W) + I(V;Z|W) \le I(U;Y|W=w^*) + I(V;Z|W=w^*)$. Let $\widehat{U}, \widehat{V}, \widehat{X}, \widehat{Y}$ and $\widehat{Z}$ be distributed according to $p(u,v,x,y,z|w^*)$. Then $I(\widehat{U};\widehat{V}) = I(U;V|W=w^*) = 0$. Therefore $I(U;Y|W) + I(V;Z|W) - I(U;V|W) + \min(I(W;Y), I(W;Z))$ is less than or equal to

$$\max_{\substack{p(u,v)p(x|uv)q(y,z|x) \\ |\mathcal{U}| = |\mathcal{V}| = 2,\ I(U;V) = 0,\ H(X|UV) = 0}} I(U;Y) + I(V;Z).$$

Next assume (2) or (3) holds, i.e. $I(U;V|WY) = 0$ or $I(U;V|WZ) = 0$. We show in Appendix IV that for any value of $w$ where $p(w) > 0$, either $I(U;V|W=w,Y) = 0$ or $I(U;V|W=w,Z) = 0$ implies that $I(U;Y|W=w) + I(V;Z|W=w) - I(U;V|W=w) \le T(p(X=1|W=w))$. Therefore $I(U;Y|W) + I(V;Z|W) - I(U;V|W) + \min(I(W;Y), I(W;Z)) \le \min(I(W;Y), I(W;Z)) + \sum_{w} p(w)\,T(p(X=1|W=w))$. This in turn implies that $I(U;Y|W) + I(V;Z|W) - I(U;V|W) + \min(I(W;Y), I(W;Z))$ is less than or equal to

$$\max_{p(w,x)q(y,z|x)} \min(I(W;Y), I(W;Z)) + \sum_{w} p(w)\,T(p(X=1|W=w)).$$

This completes the first part of the proof.

Next, we would like to show that the expression of equation (15) is equal to the expression given in Theorem 2. In order to show this, we prove that

$$\max_{p(w,x)q(y,z|x)} \min(I(W;Y), I(W;Z)) + \sum_{w} p(w)\,T(p(X=1|W=w)) \qquad (16)$$

is equal to

$$\min_{\gamma \in [0,1]} \Bigg( \max_{\substack{p(w,x)q(y,z|x) \\ |\mathcal{W}| = 2}} \gamma I(W;Y) + (1-\gamma) I(W;Z) + \sum_{w} p(w)\,T(p(X=1|W=w)) \Bigg). \qquad (17)$$

The expression given in equation (16) can be written as

$$\max_{p(w,x)q(y,z|x)} \min\Big( I(W;Y) + \sum_{w} p(w)\,T(p(X=1|W=w)),\ \ I(W;Z) + \sum_{w} p(w)\,T(p(X=1|W=w)) \Big).$$

Using Proposition 1 of [16], this expression can be expressed as

$$\min_{\gamma \in [0,1]} \Bigg( \max_{p(w,x)q(y,z|x)} \gamma I(W;Y) + (1-\gamma) I(W;Z) + \sum_{w} p(w)\,T(p(X=1|W=w)) \Bigg).$$

It remains to prove the cardinality bound of two on $W$. This is done using the strengthened Carathéodory theorem of Fenchel and Eggleston. Take an arbitrary $p(w,x)q(y,z|x)$. The vector $w \mapsto p(W=w)$ belongs to the set of vectors $w \mapsto p(\widehat{W}=w)$ satisfying the constraints $\sum_{w} p(\widehat{W}=w) = 1$, $p(\widehat{W}=w) \ge 0$ and $p(X=1) = \sum_{w} p(X=1|W=w)\,p(\widehat{W}=w)$. The first two constraints ensure that $w \mapsto p(\widehat{W}=w)$ corresponds to a probability distribution, and the third constraint ensures that one can define a random variable $\widehat{W}$, jointly distributed with $X$, $Y$ and $Z$ according to $p(\widehat{w},x)q(y,z|x)$ and further satisfying $p(X=x|\widehat{W}=w) = p(X=x|W=w)$. Since $w \mapsto p(W=w)$ belongs to the above set, it can be written as a convex combination of some of the extreme points of this set. The expression $\sum_{w}\big[(\gamma-1)H(Z|W=w) - \gamma H(Y|W=w) + T(p(X=1|W=w))\big]\,p(\widehat{W}=w)$ is linear in $p(\widehat{W}=w)$, therefore this expression for $w \mapsto p(W=w)$ is less than or equal to the corresponding expression for at least one of these extreme points. On the other hand, every extreme point of the set of vectors $w \mapsto p(\widehat{W}=w)$ satisfying the constraints $\sum_{w} p(\widehat{W}=w) = 1$, $p(\widehat{W}=w) \ge 0$ and $p(X=1) = \sum_{w} p(X=1|W=w)\,p(\widehat{W}=w)$ satisfies the property that $p(\widehat{W}=w) \ne 0$ for at most two values of $w \in \mathcal{W}$. Thus a cardinality bound of two is established. ■
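The extreme-point argument above can be made constructive. The following is a minimal sketch, not the authors' procedure: it assumes a finite alphabet for $W$ and takes as inputs hypothetical vectors `a` (playing the role of $p(X=1|W=w)$) and `f` (playing the role of the bracketed per-letter term that is linear in the marginal). The function name `reduce_support_to_two` is introduced here for illustration only. It repeatedly moves along a direction that preserves both equality constraints and does not decrease the linear objective, until at most two letters carry mass.

```python
def reduce_support_to_two(p, a, f, tol=1e-12):
    """Return p_hat with at most two non-zero entries such that
    sum(p_hat) = 1, sum(a*p_hat) = sum(a*p) and sum(f*p_hat) >= sum(f*p).

    p: probability vector on the alphabet of W (hypothetical input)
    a: a[w] stands for p(X=1|W=w)           (hypothetical input)
    f: f[w] is the per-letter linear weight (hypothetical input)
    """
    p = list(p)
    while True:
        support = [i for i, pi in enumerate(p) if pi > tol]
        if len(support) <= 2:
            return p
        idx = support[:3]
        i, j, k = idx
        # Direction d with sum(d) = 0 and sum(a*d) = 0 on coordinates i, j, k:
        # the cross product of (1, 1, 1) and (a_i, a_j, a_k).
        d = [a[k] - a[j], a[i] - a[k], a[j] - a[i]]
        if max(abs(x) for x in d) <= tol:
            # a_i = a_j = a_k: any zero-sum direction preserves both constraints.
            d = [1.0, -1.0, 0.0]
        # Orient d so that the linear objective does not decrease.
        if sum(f[m] * dm for m, dm in zip(idx, d)) < 0:
            d = [-x for x in d]
        # Largest step that keeps the three coordinates non-negative.
        t, hit = min((p[m] / -dm, m) for m, dm in zip(idx, d) if dm < 0)
        for m, dm in zip(idx, d):
            p[m] = max(0.0, p[m] + t * dm)
        p[hit] = 0.0  # this coordinate leaves the support exactly


# Illustrative numbers only.
p = [0.2, 0.3, 0.1, 0.4]
a = [0.1, 0.5, 0.9, 0.3]
f = [1.0, 0.2, 0.7, 0.4]
p_hat = reduce_support_to_two(p, a, f)
```

Each pass removes at least one letter from the support, so the loop ends after at most $|\mathcal{W}| - 2$ passes. The returned vector need not be the extreme point used in the proof; it is only one vector with support of size at most two that satisfies the same constraints and does at least as well on the linear objective.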
Proof of Lemma 2: We begin by proving that the region $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is closed. Since the ranges of all the random variables involved are finite and the conditional mutual information function is continuous, the set of admissible joint probability distributions $p(u,v,w,x,y,z)$ where $I(UVW;YZ|X) = 0$ and $p(y,z|x) = q(y,z|x)$ is a compact set (when viewed as a subset of the Euclidean space). The fact that the mutual information function is continuous implies that the union over random variables $U, V, W, X, Y, Z$ satisfying the cardinality bounds, having the joint distribution $p(u,v,w,x,y,z) = p(u,v,w,x)q(y,z|x)$, of the six-tuples

$$\Big( I(W;Y),\ I(W;Z),\ I(UW;Y),\ I(VW;Z),\ I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Y),\ I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Z) \Big)$$

is a compact set. Since the down-set of any compact set in $\mathbb{R}^6$ is closed,$^{10}$ the region $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ must be closed.

Next we prove that $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is a subset of $\mathcal{C}_{M-I}^{S_u,S_v,|\mathcal{X}|+4}(q(y,z|x))$. Take an arbitrary point $(R_1, R_2, \ldots, R_6)$ in $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$. Corresponding to $(R_1, R_2, \ldots, R_6)$ is at least one joint distribution $p_0(u,v,w,x,y,z) = p_0(u,v,w,x)q(y,z|x)$ on $U, V, W, X, Y, Z$ where $|\mathcal{U}| \le S_u$, $|\mathcal{V}| \le S_v$ and $|\mathcal{W}| \le S_w$, and furthermore the following inequalities are satisfied: $R_1 \le I(W;Y)$, $R_2 \le I(W;Z)$, $R_3 \le I(UW;Y)$, etc. Without loss of generality, assume that $p(W=w) > 0$ for all $w$. We define $\widehat{U}$, $\widehat{V}$ and $\widehat{W}$ on the same alphabets as $U$, $V$ and $W$, but will ensure that $p(\widehat{W}=w) \ne 0$ for at most $|\mathcal{X}|+4$ values of $w$. The random variables $\widehat{U}$, $\widehat{V}$ and $\widehat{W}$ that we will define are jointly distributed with $\widehat{X}, \widehat{Y}, \widehat{Z}$ in a way that

The Markov chain $\widehat{U}\widehat{V}\widehat{W} \to \widehat{X} \to \widehat{Y}\widehat{Z}$ holds;

$p(\widehat{U}=u, \widehat{V}=v, \widehat{X}=x \,|\, \widehat{W}=w) = p(U=u, V=v, X=x \,|\, W=w)$; (18)

$I(\widehat{W};\widehat{Y}) = I(W;Y)$;

$I(\widehat{W};\widehat{Z}) = I(W;Z)$;

$I(\widehat{U};\widehat{Y}|\widehat{W}) = I(U;Y|W)$;

$I(\widehat{V};\widehat{Z}|\widehat{W}) = I(V;Z|W)$;

$I(\widehat{U};\widehat{V}|\widehat{W}) \le I(U;V|W)$.

Please note that proving the existence of random variables $\widehat{U}$, $\widehat{V}$ and $\widehat{W}$ with the above properties implies that the point $(R_1, R_2, \ldots, R_6)$ belongs to $\mathcal{C}_{M-I}^{S_u,S_v,|\mathcal{X}|+4}(q(y,z|x))$.

Given that we would like to impose the equation $p(\widehat{U}=u, \widehat{V}=v, \widehat{X}=x \,|\, \widehat{W}=w) = p(U=u, V=v, X=x \,|\, W=w)$, defining the marginal distribution $p(\widehat{W}=w)$ would completely characterize the joint distribution $p(\widehat{U}=u, \widehat{V}=v, \widehat{W}=w, \widehat{X}=x)$.

$^{10}$ In order to show this, let $A \subset \mathbb{R}^d$ be a compact set. Take a convergent sequence of points $v_1, v_2, \ldots$ in the down-set of $A$. We would like to show that $v = \lim_{i\to\infty} v_i$ is in the down-set of $A$. Corresponding to $v_i$ is a point $w_i$ in $A$ where $w_i \ge v_i$. Since $A$ is compact, the sequence $w_i$ has a convergent subsequence, the limit point of which belongs to $A$. Let $w$ denote this limit point. Clearly $w \ge v$, hence $v$ is in the down-set of $A$.

In order to define the elements of the vector $w \mapsto p(\widehat{W}=w)$, we first identify the properties that this vector needs to satisfy, and then pin down an appropriate vector that has only $|\mathcal{X}|+4$ non-zero elements.

To make sure that the elements of the vector $w \mapsto p(\widehat{W}=w)$ correspond to a probability distribution, we impose the following two constraints:

$$p(\widehat{W}=w) \ge 0 \quad \forall w; \qquad (19)$$

$$\sum_{w} p(\widehat{W}=w) = 1. \qquad (20)$$

Since we require that $p(\widehat{X}=x|\widehat{W}=w) = p(X=x|W=w)$, $p(\widehat{W}=w)$ must also satisfy the consistency equation

$$\sum_{w} p(X=x|W=w)\,p(\widehat{W}=w) = p(X=x) = \sum_{w} p(X=x|W=w)\,p(W=w) \quad \forall x. \qquad (21)$$

As long as these three equations hold, the joint distribution $p(\widehat{U}=u, \widehat{V}=v, \widehat{W}=w, \widehat{X}=x)$ will be well defined. Equation (21) seems to impose $|\mathcal{X}|$ equations on $p(\widehat{W}=w)$. But in fact, one of these equations is a linear combination of the rest and equation (20), thus it is redundant. This is because $\sum_{x} p(X=x|W=w) = 1$. Therefore equation (21) imposes $|\mathcal{X}|-1$ constraints on $p(\widehat{W}=w)$.
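Next, in order to enforce $I(\widehat{W};\widehat{Y}) = I(W;Y)$, we require

$$\sum_{w} p(\widehat{W}=w)\,H(\widehat{Y}|\widehat{W}=w) = \sum_{w} p(W=w)\,H(Y|W=w). \qquad (22)$$

Please note that because of equation (18), $H(\widehat{Y}|\widehat{W}=w) = H(Y|W=w)$. Similarly, in order to enforce $I(\widehat{W};\widehat{Z}) = I(W;Z)$, we require

$$\sum_{w} p(\widehat{W}=w)\,H(\widehat{Z}|\widehat{W}=w) = \sum_{w} p(W=w)\,H(Z|W=w). \qquad (23)$$

For $I(\widehat{U};\widehat{Y}|\widehat{W}) = I(U;Y|W)$ and $I(\widehat{V};\widehat{Z}|\widehat{W}) = I(V;Z|W)$, we require

$$\sum_{w} p(\widehat{W}=w)\,I(\widehat{U};\widehat{Y}|\widehat{W}=w) = \sum_{w} p(W=w)\,I(U;Y|W=w), \qquad (24)$$

and

$$\sum_{w} p(\widehat{W}=w)\,I(\widehat{V};\widehat{Z}|\widehat{W}=w) = \sum_{w} p(W=w)\,I(V;Z|W=w). \qquad (25)$$

Please note that because of equation (18), $I(\widehat{U};\widehat{Y}|\widehat{W}=w) = I(U;Y|W=w)$ and $I(\widehat{V};\widehat{Z}|\widehat{W}=w) = I(V;Z|W=w)$.

In order to enforce $I(\widehat{U};\widehat{V}|\widehat{W}) \le I(U;V|W)$, we require

$$\sum_{w} p(\widehat{W}=w)\,I(\widehat{U};\widehat{V}|\widehat{W}=w) \le \sum_{w} p(W=w)\,I(U;V|W=w). \qquad (26)$$

Because of equation (18), $I(\widehat{U};\widehat{V}|\widehat{W}=w) = I(U;V|W=w)$.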

The rest of the proof is based on the technique of Fenchel and Eggleston to strengthen the Carathéodory theorem. The region formed by equations (19), (20), (21), (22), (23), (24) and (25) contains the vector $w \mapsto p(W=w)$. The vector $w \mapsto p(W=w)$ further lies in the half-space defined by equation (26). We can write the vector $w \mapsto p(W=w)$ as a convex combination of extreme points of the region formed by equations (19), (20), (21), (22), (23), (24) and (25). Since $w \mapsto p(W=w)$ is in the half-space, it must be the case that at least one of these extreme points satisfies equation (26). Any such extreme point can have at most $|\mathcal{X}|+4$ non-zero elements. This is because any extreme point must satisfy with equality at least $|\mathcal{W}|$ of the constraints (19), (20), (21), (22), (23), (24) and (25). The number of equations that do not force one of the elements of the vector $w \mapsto p(\widehat{W}=w)$ to zero is $|\mathcal{X}|+4$. Therefore at least $|\mathcal{W}| - |\mathcal{X}| - 4$ coordinates of an extreme point must be zero. Hence the number of non-zero elements is at most $|\mathcal{X}|+4$.
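The count behind this bound can be spelled out as follows. The tally is a sketch that uses only the constraints (19) to (25) as listed in the proof; the symbol $n$ is a shorthand introduced here for the alphabet size $|\mathcal{W}|$.

```latex
% n := |\mathcal{W}| (shorthand introduced for this note only)
\begin{align*}
\text{unknowns:}\quad
  & n \ \text{coordinates } p(\widehat{W}=w), \\
\text{equalities not of the form } p(\widehat{W}=w)=0:\quad
  & \underbrace{1}_{(20)} + \underbrace{|\mathcal{X}|-1}_{(21)}
    + \underbrace{4}_{(22)\text{--}(25)} \;=\; |\mathcal{X}|+4, \\
\text{tight constraints at an extreme point:}\quad
  & \ge n, \\
\text{hence}\quad
  & \#\{w : p(\widehat{W}=w) = 0\} \;\ge\; n - (|\mathcal{X}|+4), \\
  & \#\{w : p(\widehat{W}=w) \neq 0\} \;\le\; |\mathcal{X}|+4 .
\end{align*}
```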

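It remains to prove that the last part of Lemma 2 is true, i.e. that $\mathcal{C}_{M-I}^{S_u,S_v,|\mathcal{X}|+4}(q(y,z|x))$ is convex. Since $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is a subset of $\mathcal{C}_{M-I}^{S_u,S_v,|\mathcal{X}|+4}(q(y,z|x))$, it suffices to show that $\bigcup_{S_w \ge 0} \mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is convex. Take two arbitrary points $(R_1, R_2, \ldots, R_6)$ and $(\widetilde{R}_1, \widetilde{R}_2, \ldots, \widetilde{R}_6)$ in $\bigcup_{S_w \ge 0} \mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$. Corresponding to $(R_1, \ldots, R_6)$ and $(\widetilde{R}_1, \ldots, \widetilde{R}_6)$ are joint distributions $p_0(u,v,w,x,y,z) = p_0(u,v,w,x)q(y,z|x)$ on $U, V, W, X, Y, Z$, and $\widetilde{p}_0(\widetilde{u},\widetilde{v},\widetilde{w},\widetilde{x},\widetilde{y},\widetilde{z}) = \widetilde{p}_0(\widetilde{u},\widetilde{v},\widetilde{w},\widetilde{x})q(\widetilde{y},\widetilde{z}|\widetilde{x})$ on $\widetilde{U}, \widetilde{V}, \widetilde{W}, \widetilde{X}, \widetilde{Y}, \widetilde{Z}$, where $|\mathcal{U}| = |\widetilde{\mathcal{U}}| = S_u$, $|\mathcal{V}| = |\widetilde{\mathcal{V}}| = S_v$, and furthermore the following inequalities are satisfied: $R_1 \le I(W;Y)$, $R_2 \le I(W;Z)$, $R_3 \le I(UW;Y)$, ..., $\widetilde{R}_1 \le I(\widetilde{W};\widetilde{Y})$, $\widetilde{R}_2 \le I(\widetilde{W};\widetilde{Z})$, $\widetilde{R}_3 \le I(\widetilde{U}\widetilde{W};\widetilde{Y})$, ...

Without loss of generality we can assume that $(U, V, W, X, Y, Z)$ and $(\widetilde{U}, \widetilde{V}, \widetilde{W}, \widetilde{X}, \widetilde{Y}, \widetilde{Z})$ are independent. Let $Q$ be a uniform binary random variable independent of all previously defined random variables. Let $(\widehat{U}, \widehat{V}, \widehat{W}, \widehat{X}, \widehat{Y}, \widehat{Z})$ be equal to $(U, V, WQ, X, Y, Z)$ when $Q = 0$, and equal to $(\widetilde{U}, \widetilde{V}, \widetilde{W}Q, \widetilde{X}, \widetilde{Y}, \widetilde{Z})$ when $Q = 1$. One can verify that $p(\widehat{Y}=y, \widehat{Z}=z \,|\, \widehat{X}=x) = q(Y=y, Z=z \,|\, X=x)$, $I(\widehat{U}\widehat{V}\widehat{W};\widehat{Y}\widehat{Z}|\widehat{X}) = 0$, and furthermore

$$I(\widehat{W};\widehat{Y}) \ge \tfrac{1}{2} I(W;Y) + \tfrac{1}{2} I(\widetilde{W};\widetilde{Y})$$
$$I(\widehat{W};\widehat{Z}) \ge \tfrac{1}{2} I(W;Z) + \tfrac{1}{2} I(\widetilde{W};\widetilde{Z})$$
$$I(\widehat{U}\widehat{W};\widehat{Y}) \ge \tfrac{1}{2} I(UW;Y) + \tfrac{1}{2} I(\widetilde{U}\widetilde{W};\widetilde{Y})$$

Hence $\big(\tfrac{1}{2}R_1 + \tfrac{1}{2}\widetilde{R}_1,\ \tfrac{1}{2}R_2 + \tfrac{1}{2}\widetilde{R}_2,\ \ldots,\ \tfrac{1}{2}R_6 + \tfrac{1}{2}\widetilde{R}_6\big)$ belongs to $\bigcup_{S_w \ge 0} \mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$. Thus $\bigcup_{S_w \ge 0} \mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x)) = \mathcal{C}_{M-I}^{S_u,S_v,|\mathcal{X}|+4}(q(y,z|x))$ is convex. ■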
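Proof of Lemma 3: The equation $H(\widehat{X}) = H(X) + \epsilon H_L(X) - E[r(\epsilon \cdot E[L|X])]$, where $r(x) = (1+x)\log(1+x)$, is true because:

$$\begin{aligned}
H(\widehat{X}) &= -\sum_{x} p_0(x)\big(1 + \epsilon \cdot E[L|X=x]\big) \log\Big(p_0(x)\big(1 + \epsilon \cdot E[L|X=x]\big)\Big) \\
&= -\sum_{x} p_0(x)\big(1 + \epsilon \cdot E[L|X=x]\big) \Big[\log p_0(x) + \log\big(1 + \epsilon \cdot E[L|X=x]\big)\Big] \\
&= H(X) - \epsilon \sum_{x} p_0(x)\,E[L|X=x] \log p_0(x) - \sum_{x} p_0(x)\big(1 + \epsilon \cdot E[L|X=x]\big) \log\big(1 + \epsilon \cdot E[L|X=x]\big) \\
&= H(X) + \epsilon H_L(X) - E[r(\epsilon \cdot E[L|X])].
\end{aligned}$$

Next, note that $r(0) = 0$, $\frac{\partial}{\partial x} r(x) = \log(1+x) + \log(e)$ and $\frac{\partial^2}{\partial x^2} r(x) = \frac{\log(e)}{1+x}$. We have:

$$\frac{\partial}{\partial \epsilon} H(\widehat{X}) = H_L(X) - E\Big[E[L|X]\big\{\log(1 + \epsilon \cdot E[L|X]) + \log e\big\}\Big] = H_L(X) - E\Big[E[L|X] \log(1 + \epsilon \cdot E[L|X])\Big],$$

which at $\epsilon = 0$ is equal to $H_L(X)$. The $\log e$ term drops out because $E[E[L|X]] = E[L] = 0$, which is also what makes the perturbed distribution sum to one. Next, we have:

$$\frac{\partial^2}{\partial \epsilon^2} H(\widehat{X}) = -\frac{\partial}{\partial \epsilon} E\Big[E[L|X] \log(1 + \epsilon \cdot E[L|X])\Big] = -\log(e) \cdot E\left[\frac{E[L|X]^2}{1 + \epsilon \cdot E[L|X]}\right].$$

On the other hand,

$$\begin{aligned}
I(\epsilon) &= \sum_{x} \Big(\frac{\partial}{\partial \epsilon} \log_e p_\epsilon(\widehat{X}=x)\Big)^2 p_\epsilon(\widehat{X}=x) \\
&= \sum_{x} \Big(\frac{\partial}{\partial \epsilon} \log_e\big(p_0(X=x)\,(1 + \epsilon \cdot E[L|X=x])\big)\Big)^2 p_0(X=x)\,\big(1 + \epsilon \cdot E[L|X=x]\big) \\
&= \sum_{x} \Big(\frac{\partial}{\partial \epsilon} \log_e\big(1 + \epsilon \cdot E[L|X=x]\big)\Big)^2 p_0(X=x)\,\big(1 + \epsilon \cdot E[L|X=x]\big) \\
&= \sum_{x} \Big(\frac{E[L|X=x]}{1 + \epsilon \cdot E[L|X=x]}\Big)^2 p_0(X=x)\,\big(1 + \epsilon \cdot E[L|X=x]\big) \\
&= \sum_{x} \frac{E[L|X=x]^2}{1 + \epsilon \cdot E[L|X=x]}\, p_0(X=x) = E\left[\frac{E[L|X]^2}{1 + \epsilon \cdot E[L|X]}\right].
\end{aligned}$$

■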
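The identity and the second-derivative formula in this proof are easy to sanity-check numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it works in nats (so that $\log e = 1$), uses a hypothetical three-letter distribution `p0`, and represents $E[L|X=x]$ directly by a hypothetical vector `ell` chosen so that $\sum_x p_0(x)\,\mathrm{ell}(x) = 0$. The quantity `H_L` is computed as $-\sum_x p_0(x) E[L|X=x] \log p_0(x)$, which is what the third line of the derivation identifies with $H_L(X)$.

```python
import math

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def r(x):
    """r(x) = (1 + x) log(1 + x), natural logarithm."""
    return (1 + x) * math.log(1 + x)

# Hypothetical example values (not from the paper).
p0 = [0.5, 0.3, 0.2]
ell = [0.4, -0.2, -0.7]   # stands for E[L|X=x]; sum(p0 * ell) = 0

def perturbed(eps):
    """p_eps(x) = p0(x) * (1 + eps * E[L|X=x])."""
    return [p * (1 + eps * l) for p, l in zip(p0, ell)]

def h_hat(eps):
    """Entropy of the perturbed variable."""
    return entropy(perturbed(eps))

# H_L(X) as identified in the derivation.
H_L = -sum(p * l * math.log(p) for p, l in zip(p0, ell))

def identity_rhs(eps):
    """H(X) + eps * H_L(X) - E[r(eps * E[L|X])]."""
    return entropy(p0) + eps * H_L - sum(p * r(eps * l) for p, l in zip(p0, ell))

def second_derivative_formula(eps):
    """-E[ E[L|X]^2 / (1 + eps * E[L|X]) ]; in nats log(e) = 1."""
    return -sum(p * l * l / (1 + eps * l) for p, l in zip(p0, ell))

def second_derivative_numeric(eps, h=1e-3):
    """Central finite-difference estimate of the second derivative of h_hat."""
    return (h_hat(eps + h) - 2 * h_hat(eps) + h_hat(eps - h)) / (h * h)
```

Comparing `h_hat(eps)` with `identity_rhs(eps)`, and `second_derivative_numeric(eps)` with `second_derivative_formula(eps)`, for a small `eps` such as `0.3` gives agreement up to rounding and discretisation error. The second derivative is negative, consistent with it being minus a Fisher information.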

Appendix I 

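In this Appendix we prove equation (11) assuming that $\lambda_5 > 0$ or $\lambda_6 > 0$, or both. Let $(R_1, R_2, \ldots, R_6)$ be a point in $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ where the maximum of $\sum_{i=1}^{6} \lambda_i R'_i$ over $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is obtained.$^{11}$ Corresponding to $(R_1, R_2, \ldots, R_6)$ is at least one joint distribution $p_0(u,v,w,x,y,z) = p_0(u,v,w,x)q(y,z|x)$ on $U, V, W, X, Y, Z$ where $|\mathcal{U}| \le S_u$, $|\mathcal{V}| \le S_v$ and $|\mathcal{W}| \le S_w$, and furthermore the following inequalities are satisfied: $R_1 \le I(W;Y)$, $R_2 \le I(W;Z)$, $R_3 \le I(UW;Y)$, etc. The maximum of $\sum_{i=1}^{6} \lambda_i R'_i$ over $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ must then be equal to $\lambda_1 I(W;Y) + \lambda_2 I(W;Z) + \lambda_3 I(UW;Y) + \lambda_4 I(VW;Z) + \lambda_5 \big(I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Y)\big) + \lambda_6 \big(I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Z)\big)$. We would like to define random variables $\widehat{U}, \widehat{V}, \widehat{W}, \widehat{X}, \widehat{Y}$ and $\widehat{Z}$ jointly distributed according to $p(\widehat{u},\widehat{v},\widehat{w},\widehat{x})q(\widehat{y},\widehat{z}|\widehat{x})$, and satisfying the following properties:

• $\lambda_1 I(W;Y) + \lambda_2 I(W;Z) + \lambda_3 I(UW;Y) + \lambda_4 I(VW;Z) + \lambda_5 \big(I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Y)\big) + \lambda_6 \big(I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Z)\big)$ is less than or equal to $\lambda_1 I(\widehat{W};\widehat{Y}) + \lambda_2 I(\widehat{W};\widehat{Z}) + \lambda_3 I(\widehat{U}\widehat{W};\widehat{Y}) + \lambda_4 I(\widehat{V}\widehat{W};\widehat{Z}) + \lambda_5 \big(I(\widehat{U};\widehat{Y}|\widehat{W}) + I(\widehat{V};\widehat{Z}|\widehat{W}) - I(\widehat{U};\widehat{V}|\widehat{W}) + I(\widehat{W};\widehat{Y})\big) + \lambda_6 \big(I(\widehat{U};\widehat{Y}|\widehat{W}) + I(\widehat{V};\widehat{Z}|\widehat{W}) - I(\widehat{U};\widehat{V}|\widehat{W}) + I(\widehat{W};\widehat{Z})\big)$.

• $|\widehat{\mathcal{U}}| = |\mathcal{X}|$.

• $|\widehat{\mathcal{V}}| = |\mathcal{V}|$.

• $|\widehat{\mathcal{W}}| = |\mathcal{W}|$.

Instead of finding $\widehat{U}$ that takes values in a set of size at most $|\mathcal{X}|$,$^{12}$ it however suffices to find an appropriate $\widehat{U}$ such that for any $w$, the conditional distribution $p(\widehat{u}|w) \ne 0$ for at most $|\mathcal{X}|$ values of $\widehat{u}$.

We assume that the random variables $\widehat{U}, \widehat{V}, \widehat{W}, \widehat{X}, \widehat{Y}$ and $\widehat{Z}$ are respectively defined on the alphabet sets of $U, V, W, X, Y$ and $Z$. Without loss of generality assume $p(W=w) > 0$ for all $w \in \mathcal{W}$. We assume that the joint distribution of $\widehat{W}, \widehat{X}, \widehat{Y}, \widehat{Z}$ is the same as that of $W, X, Y, Z$. Therefore $I(\widehat{W};\widehat{Y}) = I(W;Y)$ and $I(\widehat{W};\widehat{Z}) = I(W;Z)$. We therefore need to define $p(\widehat{u},\widehat{v}|\widehat{w},\widehat{x})$ such that

• For any $w \in \mathcal{W}$, $\lambda_3 I(U;Y|W=w) + \lambda_4 I(V;Z|W=w) + \lambda_5 \big(I(U;Y|W=w) + I(V;Z|W=w) - I(U;V|W=w)\big) + \lambda_6 \big(I(U;Y|W=w) + I(V;Z|W=w) - I(U;V|W=w)\big)$ is less than or equal to $\lambda_3 I(\widehat{U};\widehat{Y}|\widehat{W}=w) + \lambda_4 I(\widehat{V};\widehat{Z}|\widehat{W}=w) + \lambda_5 \big(I(\widehat{U};\widehat{Y}|\widehat{W}=w) + I(\widehat{V};\widehat{Z}|\widehat{W}=w) - I(\widehat{U};\widehat{V}|\widehat{W}=w)\big) + \lambda_6 \big(I(\widehat{U};\widehat{Y}|\widehat{W}=w) + I(\widehat{V};\widehat{Z}|\widehat{W}=w) - I(\widehat{U};\widehat{V}|\widehat{W}=w)\big)$.

• $|\widehat{\mathcal{V}}| = |\mathcal{V}|$.

• For any $w$, $p(\widehat{U}=u|\widehat{W}=w) \ne 0$ for at most $|\mathcal{X}|$ values of $u$.

The above statement holds since Lemma 1 of Section III holds.

$^{11}$ Note that by Lemma 2, $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is closed and furthermore $\sum_{i=1}^{6} \lambda_i R'_i$ is bounded from above when $\lambda_i \ge 0$. Hence the maximum of $\sum_{i=1}^{6} \lambda_i R'_i$ over the region $\mathcal{C}_{M-I}^{S_u,S_v,S_w}(q(y,z|x))$ is well defined.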

$^{12}$ This is true because Marton's inner bound depends only on the conditional distribution of $U$ given $W$, rather than the distribution of $U$ itself. More specifically, assume that we are given a random variable $U$ such that for every $w \in \mathcal{W}$, there is a subset $A_w$ of the alphabet set of $U$ satisfying $|A_w| = |\mathcal{X}|$, and $p(U=u|W=w) = 0$ if $u \notin A_w$. Assume that $A_w = \{a_{w,1}, a_{w,2}, a_{w,3}, \ldots, a_{w,|\mathcal{X}|}\}$. Define $U'$, a random variable taking values from the set $\{1, 2, 3, \ldots, |\mathcal{X}|\}$, as follows: $p(U'=i \,|\, W=w, V=v, X=x) = p(U=a_{w,i} \,|\, W=w, V=v, X=x)$. The alphabet set of $U'$ is of size $|\mathcal{X}|$ and furthermore $I(U';V|W) = I(U;V|W)$ and $I(U';Y|W) = I(U;Y|W)$.

Appendix II

In this Appendix, we prove that the closure of $\mathcal{C}_M(q(y,z|x))$ is equal to the closure of $\bigcup_{S_u,S_v,S_w \ge 0} \mathcal{C}_M^{S_u,S_v,S_w}(q(y,z|x))$. In order to show this it suffices to show that any triple $(R_0, R_1, R_2)$ in $\mathcal{C}_M(q(y,z|x))$ is a limit point of $\bigcup_{S_u,S_v,S_w \ge 0} \mathcal{C}_M^{S_u,S_v,S_w}(q(y,z|x))$. Since $(R_0, R_1, R_2)$ is in $\mathcal{C}_M(q(y,z|x))$, random variables $U, V, W, X, Y$ and $Z$ for which equations (1), (2), (3) and (4) are satisfied exist. First assume $U, V, W$ are discrete random variables taking values in $\{1, 2, 3, \ldots\}$. For any integer $m$, let $U_m$, $V_m$ and $W_m$ be truncated versions of $U$, $V$ and $W$ defined on $\{1, 2, 3, \ldots, m\}$ as follows: $U_m$, $V_m$ and $W_m$ are jointly distributed according to $p(U_m=u, V_m=v, W_m=w) = \frac{p(U=u, V=v, W=w)}{p(U \le m, V \le m, W \le m)}$ for every $u$, $v$ and $w$ less than or equal to $m$. Further assume that $X_m$, $Y_m$ and $Z_m$ are random variables defined on $\mathcal{X}$, $\mathcal{Y}$ and $\mathcal{Z}$ where $p(Y_m=y, Z_m=z, X_m=x \,|\, U_m=u, V_m=v, W_m=w) = p(Y=y, Z=z, X=x \,|\, U=u, V=v, W=w)$ for every $u$, $v$ and $w$ less than or equal to $m$, and for every $x$, $y$ and $z$. Note that the joint distribution of $U_m, V_m, W_m, X_m, Y_m$ and $Z_m$ converges to that of $U, V, W, X, Y$ and $Z$ as $m \to \infty$. Therefore the mutual information terms $I(W_m;Y_m)$, $I(W_m;Z_m)$, $I(W_mU_m;Y_m)$, ... (that define a region in $\mathcal{C}_M^{m,m,m}(q(y,z|x))$) converge to the corresponding terms $I(W;Y)$, $I(W;Z)$, $I(WU;Y)$, ... Therefore $(R_0, R_1, R_2)$ is a limit point of $\bigcup_{S_u,S_v,S_w \ge 0} \mathcal{C}_M^{S_u,S_v,S_w}(q(y,z|x))$.
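The truncation step can be illustrated numerically. The sketch below is a toy instance and not taken from the paper: it assumes a single auxiliary variable $W$ with the hypothetical geometric law $p(W=w) = 2^{-w}$ on $\{1, 2, \ldots\}$, a binary output $Y$, and a hypothetical composite channel $p(Y=1|W=w) = 1/(w+1)$. It forms the truncated variable $W_m$ exactly as in the construction above (renormalising by $p(W \le m)$ and keeping the conditional law unchanged), then computes $I(W_m;Y_m)$.

```python
import math

def mutual_information(p_w, p_y1_given_w):
    """I(W;Y) in bits for a binary Y, given p(w) and p(Y=1|W=w)."""
    def h(t):
        # Binary entropy in bits.
        if t <= 0.0 or t >= 1.0:
            return 0.0
        return -t * math.log2(t) - (1 - t) * math.log2(1 - t)
    p_y1 = sum(pw * q for pw, q in zip(p_w, p_y1_given_w))
    return h(p_y1) - sum(pw * h(q) for pw, q in zip(p_w, p_y1_given_w))

def truncated_information(m):
    """I(W_m;Y_m) for the truncated variable W_m on {1, ..., m}."""
    # Hypothetical source: p(W = w) = 2^{-w}, w = 1, 2, ...
    weights = [2.0 ** -w for w in range(1, m + 1)]
    total = sum(weights)                      # p(W <= m)
    p_wm = [x / total for x in weights]       # p(W_m = w) = p(W = w) / p(W <= m)
    # Hypothetical channel, left unchanged by the truncation.
    channel = [1.0 / (w + 1) for w in range(1, m + 1)]
    return mutual_information(p_wm, channel)

# Mutual information for increasing truncation levels.
levels = [5, 20, 50, 200]
values = [truncated_information(m) for m in levels]
```

In this toy instance the values stabilise quickly as $m$ grows, because the mass removed by truncation, $2^{-m}$, decays geometrically. This mirrors the convergence of $I(W_m;Y_m)$ to $I(W;Y)$ used in the argument.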

Next assume that some of the random variables $U$, $V$ and $W$ are continuous. Given any positive $q$, one can quantize the continuous random variables to a precision $q$, and get discrete random variables $U_q$, $V_q$ and $W_q$. We have already established that any point in Marton's inner bound region corresponding to $U_q, V_q, W_q, X, Y, Z$ is a limit point of $\bigcup_{S_u,S_v,S_w \ge 0} \mathcal{C}_M^{S_u,S_v,S_w}(q(y,z|x))$. The joint distribution of $U_q, V_q, W_q, X, Y, Z$ converges to that of $U, V, W, X, Y, Z$ as $q$ converges to zero. Therefore the corresponding mutual information terms $I(W_q;Y)$, $I(W_q;Z)$, $I(W_qU_q;Y)$, ... (that define a region in $\mathcal{C}_M(q(y,z|x))$) converge to the corresponding terms $I(W;Y)$, $I(W;Z)$, $I(WU;Y)$, ... Therefore $(R_0, R_1, R_2)$ is a limit point of $\bigcup_{S_u,S_v,S_w \ge 0} \mathcal{C}_M^{S_u,S_v,S_w}(q(y,z|x))$.

Appendix III

In this Appendix, we prove that $\mathscr{C}(q(y,z|x))$ is equal to $\mathscr{L}(q(y,z|x))$. Clearly $\mathscr{L}(q(y,z|x)) \subset \mathscr{C}(q(y,z|x))$. Therefore we need to show that $\mathscr{C}(q(y,z|x)) \subset \mathscr{L}(q(y,z|x))$. Instead we show that $\mathscr{C}_I(q(y,z|x)) \subset \mathscr{L}_I(q(y,z|x))$.$^{13}$ It suffices to prove that $\mathscr{L}_I(q(y,z|x))$ is convex, and that for any $\lambda_1, \lambda_2, \ldots, \lambda_6$, the maximum of $\sum_{i=1}^{6} \lambda_i R_i$ over six-tuples $(R_1, R_2, \ldots, R_6)$ in $\mathscr{C}_I(q(y,z|x))$ is less than or equal to the maximum of $\sum_{i=1}^{6} \lambda_i R_i$ over six-tuples $(R_1, R_2, \ldots, R_6)$ in $\mathscr{L}_I(q(y,z|x))$.

$^{13}$ This is true because $(R_0, R_1, R_2)$ being in $\mathscr{C}(q(y,z|x))$ implies that $(R_0, R_0, R_0 + R_1, R_0 + R_2, R_0 + R_1 + R_2, R_0 + R_1 + R_2)$ is in $\mathscr{C}_I(q(y,z|x))$. If $\mathscr{C}_I(q(y,z|x))$ is a subset of $\mathscr{L}_I(q(y,z|x))$, the latter point would belong to $\mathscr{L}_I(q(y,z|x))$. Therefore $(R_0, R_1, R_2)$ belongs to $\mathscr{L}(q(y,z|x))$.

In order to show that $\mathscr{L}_I(q(y,z|x))$ is convex, we take two arbitrary points in $\mathscr{L}_I(q(y,z|x))$. Corresponding to them are joint distributions $p(u_1, v_1, w_1, x_1, y_1, z_1)$ and $p(u_2, v_2, w_2, x_2, y_2, z_2)$. Let $Q$ be a uniform binary random variable independent of all previously defined random variables, and let $U = U_Q$, $V = V_Q$, $W = (W_Q, Q)$, $X = X_Q$, $Y = Y_Q$ and $Z = Z_Q$. Clearly $H(X|UVW) = 0$, and furthermore $I(W;Y) \ge \tfrac{1}{2}\big(I(W_1;Y_1) + I(W_2;Y_2)\big)$, $I(W;Z) \ge \tfrac{1}{2}\big(I(W_1;Z_1) + I(W_2;Z_2)\big)$, .... The random variable $W$ is not, however, defined on an alphabet set of size $|\mathcal{X}|+4$. One can nevertheless reduce the cardinality of $W$ using the Carathéodory theorem (as in the proof of part two of Lemma 2) by fixing $p(u,v,x,y,z|w)$ and changing the marginal distribution of $W$ in a way that at most $|\mathcal{X}|+4$ elements get non-zero probability assigned to them. Since we have preserved $p(u,v,x,y,z|w)$ throughout the process, $p(x|u,v,w)$ will still belong to the set $\{0, 1\}$ after reducing the cardinality of $W$.

Next, we need to show that for any $\lambda_1, \lambda_2, \ldots, \lambda_6$, the maximum of $\sum_{i=1}^{6} \lambda_i R_i$ over six-tuples $(R_1, R_2, \ldots, R_6)$ in $\mathscr{C}_I(q(y,z|x))$ is less than or equal to the maximum of $\sum_{i=1}^{6} \lambda_i R_i$ over six-tuples $(R_1, R_2, \ldots, R_6)$ in $\mathscr{L}_I(q(y,z|x))$. As discussed in the proof of Theorem 1, without loss of generality we can assume $\lambda_i$ is non-negative for $i = 1, 2, \ldots, 6$.

Take an arbitrary point $(R_1, R_2, \ldots, R_6)$ in $\mathscr{C}_I(q(y,z|x))$. By definition there exist random variables $U, V, W, X, Y$ and $Z$ for which

$$\sum_{i=1}^{6} \lambda_i R_i \le \lambda_1 I(W;Y) + \lambda_2 I(W;Z) + \lambda_3 I(UW;Y) + \lambda_4 I(VW;Z) + \lambda_5 \big(I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Y)\big) + \lambda_6 \big(I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Z)\big). \qquad (27)$$

Fix $p(u,v,w)$. The right hand side of equation (27) would then be a convex function of $p(x|u,v,w)$.$^{14}$ Therefore its maximum occurs at the extreme points, where $p(x|u,v,w) \in \{0, 1\}$ whenever $p(u,v,w) \ne 0$. Therefore random variables $\widehat{U}, \widehat{V}, \widehat{W}, \widehat{X}, \widehat{Y}$ and $\widehat{Z}$ exist for which

$$\lambda_1 I(W;Y) + \lambda_2 I(W;Z) + \ldots + \lambda_6 \big(I(U;Y|W) + I(V;Z|W) - I(U;V|W) + I(W;Z)\big) \le \lambda_1 I(\widehat{W};\widehat{Y}) + \lambda_2 I(\widehat{W};\widehat{Z}) + \ldots + \lambda_6 \big(I(\widehat{U};\widehat{Y}|\widehat{W}) + I(\widehat{V};\widehat{Z}|\widehat{W}) - I(\widehat{U};\widehat{V}|\widehat{W}) + I(\widehat{W};\widehat{Z})\big)$$

and furthermore $p(\widehat{x}|\widehat{u},\widehat{v},\widehat{w}) \in \{0, 1\}$ for all $\widehat{x}$, $\widehat{u}$, $\widehat{v}$ and $\widehat{w}$ where $p(\widehat{u},\widehat{v},\widehat{w}) > 0$.
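The step "convex, hence maximised at an extreme point" can be checked on a small example. The sketch below is illustrative only and rests on simplifying assumptions: it treats the conditional law $p(y|w)$ directly as the variable (in the proof the variable is $p(x|u,v,w)$, on which $p(y|w)$ depends linearly), fixes a hypothetical marginal `p_w`, and moves along the segment between two hypothetical deterministic channels `k0` and `k1`.

```python
import math

def mutual_information(p_w, channel):
    """I(W;Y) in bits; channel[w][y] = p(y|w), p_w[w] = p(w)."""
    n_y = len(channel[0])
    p_y = [sum(pw * row[y] for pw, row in zip(p_w, channel)) for y in range(n_y)]
    return sum(pw * row[y] * math.log2(row[y] / p_y[y])
               for pw, row in zip(p_w, channel)
               for y in range(n_y) if row[y] > 0)

# Hypothetical example values (not from the paper).
p_w = [0.3, 0.7]
k0 = [[1.0, 0.0], [0.0, 1.0]]   # deterministic channel: Y = W
k1 = [[0.0, 1.0], [0.0, 1.0]]   # deterministic channel: Y = 1 always

def mix(t):
    """Channel (1 - t) * k0 + t * k1, a point on the segment between k0 and k1."""
    return [[(1 - t) * a + t * b for a, b in zip(r0, r1)]
            for r0, r1 in zip(k0, k1)]

# Mutual information along the segment, t = 0, 0.1, ..., 1.
values = [mutual_information(p_w, mix(s / 10)) for s in range(11)]
```

Along this segment every entry of `values` lies on or below the chord joining the two endpoint values, and the largest entry is attained at an endpoint. This is the behaviour the proof relies on when it replaces $p(x|u,v,w)$ by a $\{0,1\}$-valued conditional law.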

$^{14}$ This is true because $I(W;Y)$ is convex in the conditional distribution $p(y|w)$; similarly $I(U;Y|W=w)$ is convex for any fixed value of $w$. The term $I(U;V|W)$ that appears with a negative sign is constant since the joint distribution $p(u,v,w)$ is fixed.

Appendix IV

In this Appendix, we complete the proof of Theorem 2 by showing that given any random variables $U, V, W, X, Y$ and $Z$ where $UVW \to X \to YZ$ holds, $U$ and $V$ are binary, $H(X|UVW)$ is zero, the transition matrices $P_{Y|X}$ and $P_{Z|X}$ have positive elements, and for any value of $w$ where $p(w) > 0$, either $I(U;V|W=w,Y) = 0$ or $I(U;V|W=w,Z) = 0$ holds, the following inequality is true:

$$I(U;Y|W=w) + I(V;Z|W=w) - I(U;V|W=w) \le T(p(X=1|W=w)).$$

We assume $I(U;V|W=w,Y) = 0$ (the proof for the case $I(U;V|W=w,Z) = 0$ is similar). First consider the case in which the individual capacity $C_{P_{Y|X}}$ is zero. We will then have $I(U;Y|W=w) = 0$ and $T(p(X=1|W=w)) = I(X;Z|W=w) \ge I(V;Z|W=w) - I(U;V|W=w)$. Therefore the inequality holds in this case. Assume therefore that $C_{P_{Y|X}}$ is non-zero.

It suffices to prove the following proposition:

Proposition: For any random variables $U, V, X, Y$ and $Z$ satisfying

• $UV \to X \to YZ$,

• $H(X|UV) = 0$,

• $|\mathcal{U}| = |\mathcal{V}| = |\mathcal{X}| = 2$,

• for all $y \in \mathcal{Y}$, $p(Y=y|X=0)$ and $p(Y=y|X=1)$ are non-zero,

• $I(U;V|Y) = 0$,

one of the following two cases must be true: (1) at least one of the random variables $X$, $U$ or $V$ is constant, (2) either $U = X$ or $U = 1 - X$ or $V = X$ or $V = 1 - X$.

Proof: Assume that neither (1) nor (2) holds. Since $H(X|UV) = 0$, there are $2^4$ possible descriptions for $p(x|uv)$, some of which are ruled out because neither (1) nor (2) holds. In the following we prove that $X = U \oplus V$ and $X = U \wedge V$ cannot hold. The proof for other cases is essentially the same.

Note that $C_{P_{Y|X}} \ne 0$ implies that the transition matrix $P_{Y|X}$ has linearly independent rows. This implies the existence of $y_1, y_2 \in \mathcal{Y}$ for which $p(X=1|Y=y_1) \ne p(X=1|Y=y_2)$.$^{15}$ Furthermore since $X$ is not constant, and $p(Y=y_1|X=0)$, $p(Y=y_1|X=1)$, $p(Y=y_2|X=0)$ and $p(Y=y_2|X=1)$ are all non-zero, both $p(X=1|Y=y_1)$ and $p(X=1|Y=y_2)$ are in the open interval $(0, 1)$. Note that $I(U;V|Y) = 0$ implies that $I(U;V|Y=y_1) = 0$ and $I(U;V|Y=y_2) = 0$.

Let $a_{i,j} = p(U=i, V=j)$ for $i, j \in \{0, 1\}$. First assume that $X = U \oplus V$. We have, for $i = 1, 2$,

• $p(u=0, v=0 \,|\, y=y_i) = \frac{a_{0,0}}{a_{0,0} + a_{1,1}}\, p(X=0|Y=y_i)$,

• $p(u=0, v=1 \,|\, y=y_i) = \frac{a_{0,1}}{a_{0,1} + a_{1,0}}\, p(X=1|Y=y_i)$,

• $p(u=1, v=0 \,|\, y=y_i) = \frac{a_{1,0}}{a_{0,1} + a_{1,0}}\, p(X=1|Y=y_i)$,

• $p(u=1, v=1 \,|\, y=y_i) = \frac{a_{1,1}}{a_{0,0} + a_{1,1}}\, p(X=0|Y=y_i)$.

Therefore $I(U;V|Y=y_i) = 0$ for $i = 1, 2$ implies that

$$p(u=1, v=1 \,|\, y=y_i) \times p(u=0, v=0 \,|\, y=y_i) = p(u=0, v=1 \,|\, y=y_i) \times p(u=1, v=0 \,|\, y=y_i).$$

Therefore

$$\frac{a_{0,0}\, a_{1,1}}{(a_{0,0} + a_{1,1})^2}\, p(X=0|Y=y_i)^2 = \frac{a_{0,1}\, a_{1,0}}{(a_{0,1} + a_{1,0})^2}\, p(X=1|Y=y_i)^2,$$

or alternatively

$$\frac{\sqrt{a_{0,0}\, a_{1,1}}}{a_{0,0} + a_{1,1}}\, p(X=0|Y=y_i) = \frac{\sqrt{a_{0,1}\, a_{1,0}}}{a_{1,0} + a_{0,1}}\, p(X=1|Y=y_i). \qquad (28)$$

Since $X$ is not deterministic, $P(X=0) = a_{0,0} + a_{1,1}$ and $P(X=1) = a_{1,0} + a_{0,1}$ are non-zero. Next, if either of $a_{0,0}$ or $a_{1,1}$ is zero, it implies that $a_{1,0}$ or $a_{0,1}$ is zero. But this implies that either $U$ or $V$ is a constant random variable, which is a contradiction. Hence $\frac{\sqrt{a_{0,0}\, a_{1,1}}}{a_{0,0} + a_{1,1}}$ and $\frac{\sqrt{a_{0,1}\, a_{1,0}}}{a_{1,0} + a_{0,1}}$ are non-zero. But then equation (28) uniquely specifies $p(X=1|Y=y_i)$, implying that $p(X=1|Y=y_1) = p(X=1|Y=y_2)$, which is again a contradiction.

$^{15}$ If this is not the case we have $p(X=1|Y=y_1) = p(X=1|Y=y_2)$ for all $y_1, y_2 \in \mathcal{Y}$. This would imply that $X$ and $Y$ are independent. Since $X$ is not constant, independence of $X$ and $Y$ implies that $p(Y=y|X=1) = p(Y=y|X=0)$ for all $y \in \mathcal{Y}$. Therefore the transition matrix $P_{Y|X}$ has linearly dependent rows. Hence $I(X;Y) = 0$ for all $p(x)$. Therefore $C_{P_{Y|X}} = 0$, which is a contradiction.

Next assume that $X = U \wedge V$. We have:

• $p(u=0, v=0 \,|\, y=y_i) = \frac{a_{0,0}}{a_{0,0} + a_{0,1} + a_{1,0}}\, p(X=0|Y=y_i)$,

• $p(u=0, v=1 \,|\, y=y_i) = \frac{a_{0,1}}{a_{0,0} + a_{0,1} + a_{1,0}}\, p(X=0|Y=y_i)$,

• $p(u=1, v=0 \,|\, y=y_i) = \frac{a_{1,0}}{a_{0,0} + a_{0,1} + a_{1,0}}\, p(X=0|Y=y_i)$,

• $p(u=1, v=1 \,|\, y=y_i) = p(X=1|Y=y_i)$.

Note that $P(X=0) = a_{0,0} + a_{0,1} + a_{1,0}$ is non-zero. Independence of $U$ and $V$ given $Y = y_i$ implies that

$$p(u=1, v=1 \,|\, y=y_i) \times p(u=0, v=0 \,|\, y=y_i) = p(u=0, v=1 \,|\, y=y_i) \times p(u=1, v=0 \,|\, y=y_i).$$

Therefore

$$\frac{a_{0,0}}{a_{0,0} + a_{0,1} + a_{1,0}}\, p(X=0|Y=y_i)\, p(X=1|Y=y_i) = \frac{a_{0,1}\, a_{1,0}}{(a_{0,0} + a_{0,1} + a_{1,0})^2}\, p(X=0|Y=y_i)^2,$$

or alternatively

$$a_{0,0} \cdot p(X=1|Y=y_i) = \frac{a_{0,1}\, a_{1,0}}{a_{0,0} + a_{0,1} + a_{1,0}}\, p(X=0|Y=y_i). \qquad (29)$$

If $a_{0,0}$ is zero, either $a_{1,0}$ or $a_{0,1}$ must also be zero, but this implies that either $U$ or $V$ is a constant random variable, which is a contradiction. Therefore $a_{0,0}$ is non-zero. But then equation (29) uniquely specifies $p(X=1|Y=y_i)$, implying that $p(X=1|Y=y_1) = p(X=1|Y=y_2)$, which is again a contradiction.